Click any tag below to further narrow down your results
Links
The article discusses ScyllaDB's capabilities for vector similarity search, highlighting its performance benchmarks with a dataset of 1 billion vectors. It details how the architecture achieves low latency and high throughput while simplifying operations by integrating structured and unstructured data. Two scenarios are outlined, showcasing different trade-offs between recall and latency.
ClickHouse has implemented QBit, a new column type that allows flexible vector searches by storing floats as bit planes. This innovation lets users adjust precision and performance at query time, improving efficiency without the need for upfront decisions.
Lance is a modern columnar data format designed for machine learning workflows, offering significantly faster random access and features like zero-cost schema evolution and rich secondary indices. It integrates with popular data tools such as Pandas, DuckDB, and Pyarrow, making it ideal for applications like search engines, large-scale ML training, and managing complex datasets. Lance's design optimizes data handling across various stages of machine learning development, outperforming traditional formats like Parquet and JSON in multiple scenarios.