mirror of https://github.com/sysown/proxysql
Transform genai_demo_event.cpp from skeleton to working POC that:
- Integrates with real llama-server on port 8013 for embeddings
- Uses shared memory (passing pointers, not copying data)
- Supports single or multiple documents per request
- Properly transfers memory ownership between GenAI and client

Architecture changes:
- Document struct: passed by pointer from client to GenAI
- RequestHeader: includes document_count and operation type
- ResponseHeader: includes embedding_size and embedding_ptr
- EmbeddingResult: allocated by GenAI, owned by client after response

libcurl integration:
- HTTP POST to llama-server embedding API
- JSON parsing of embedding responses
- Error handling for network/API failures

Key features:
- Clients wait for response before sending next request
  (ensures document pointers remain valid)
- GenAI workers handle multiple concurrent requests
- Embedding dimension: 1023 floats (from llama-server)
- Processing time: 30-250ms (real API latency)

Results:
- 5 clients completed 9 embedding requests
- All embeddings successfully retrieved
- Zero-copy data transfer via shared memory pointers
- Early termination when all work completed

Future-ready:
- Operation enum (OP_EMBEDDING, OP_COMPLETION, OP_RAG)
- Extensible for other GenAI operations
- Document count supports batch processing

pull/5310/head
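The ownership handoff described in the commit message can be illustrated with a minimal sketch. The diff itself is not shown here, so everything below is an assumption rather than the actual contents of genai_demo_event.cpp: the field names `text`, `text_size`, `request_id`, and `values`, and the helpers `handle_request` and `take_result`, are hypothetical. Only the struct names, `document_count`, `embedding_size`, `embedding_ptr`, and the `OP_*` values come from the message. The HTTP call to llama-server is replaced by a stand-in that fills the buffer locally.

```cpp
#include <cstddef>
#include <cstdint>
#include <memory>
#include <vector>

// Operation values named in the commit message; numeric values are assumed.
enum Operation : std::uint32_t { OP_EMBEDDING = 0, OP_COMPLETION = 1, OP_RAG = 2 };

// Client-owned document, passed to the worker by pointer (never copied).
struct Document {
    const char* text;       // hypothetical field
    std::size_t text_size;  // hypothetical field
};

struct RequestHeader {
    std::uint64_t request_id;      // hypothetical field
    Operation operation;
    std::uint32_t document_count;  // 1 for a single document, >1 for a batch
};

// Allocated by the worker; owned by the client once the response arrives.
struct EmbeddingResult {
    std::vector<float> values;  // document_count * embedding_size floats
};

struct ResponseHeader {
    std::uint64_t request_id;  // hypothetical field
    std::uint32_t embedding_size;
    EmbeddingResult* embedding_ptr;
};

// Worker side. The real POC would POST the documents to llama-server here;
// this stand-in writes each document's length into its embedding slots.
ResponseHeader handle_request(const RequestHeader& req, const Document* docs,
                              std::uint32_t embedding_size) {
    auto result = std::make_unique<EmbeddingResult>();
    result->values.reserve(static_cast<std::size_t>(req.document_count) * embedding_size);
    for (std::uint32_t d = 0; d < req.document_count; ++d) {
        for (std::uint32_t i = 0; i < embedding_size; ++i) {
            result->values.push_back(static_cast<float>(docs[d].text_size));
        }
    }
    // release() hands the raw pointer to the response; the worker no longer owns it.
    return ResponseHeader{req.request_id, embedding_size, result.release()};
}

// Client side. Wraps the received pointer so it is freed exactly once.
std::unique_ptr<EmbeddingResult> take_result(ResponseHeader& resp) {
    std::unique_ptr<EmbeddingResult> owned(resp.embedding_ptr);
    resp.embedding_ptr = nullptr;  // the header must not be used to free it again
    return owned;
}
```

In this sketch the client keeps its `Document` array alive until `handle_request` returns, which mirrors the rule that clients wait for a response before sending the next request. Wrapping `embedding_ptr` in `std::unique_ptr` on receipt is one way to make the transfer explicit; a plain `delete` by the client would serve the same purpose.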
parent 012142eeed
commit 2c0f3a2e64
Binary file not shown.
File diff suppressed because it is too large