The article discusses how to speed up queries over the FDA's drug event dataset, which is stored as large, nested JSON files, by normalizing repeated fields, particularly pharm_class_epc. The author extracted these repeated values into a separate lookup table and referenced them by integer ID, which significantly improved query performance and reduced memory usage in DuckDB, turning slow, resource-intensive queries into fast, efficient ones.
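
The normalization idea can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions, not the author's DuckDB code: the record shape, the `report_id` key, the sample class names, and the `normalize` helper are all hypothetical, and in-memory dicts stand in for database tables.

```python
from typing import Dict, List, Tuple

# Hypothetical, heavily simplified records. The real dataset is nested far
# more deeply; only the repeated pharm_class_epc strings matter here.
records = [
    {"report_id": 1, "pharm_class_epc": ["Nonsteroidal Anti-inflammatory Drug [EPC]"]},
    {"report_id": 2, "pharm_class_epc": ["Nonsteroidal Anti-inflammatory Drug [EPC]",
                                         "Platelet Aggregation Inhibitor [EPC]"]},
    {"report_id": 3, "pharm_class_epc": ["Platelet Aggregation Inhibitor [EPC]"]},
]


def normalize(rows: List[dict]) -> Tuple[Dict[str, int], List[Tuple[int, int]]]:
    """Split repeated strings into a lookup table plus (report_id, class_id) pairs."""
    lookup: Dict[str, int] = {}          # class name -> integer ID
    links: List[Tuple[int, int]] = []    # one row per (report, class) pair
    for row in rows:
        for name in row.get("pharm_class_epc", []):
            # Assign the next ID the first time a name is seen; reuse it afterwards.
            class_id = lookup.setdefault(name, len(lookup) + 1)
            links.append((row["report_id"], class_id))
    return lookup, links


lookup, links = normalize(records)

# Filtering now compares small integers instead of long repeated strings.
target_id = lookup["Platelet Aggregation Inhibitor [EPC]"]
matching_reports = [report_id for report_id, class_id in links if class_id == target_id]
```

Each distinct class name is stored once in the lookup table, and every other reference to it is a single integer. That is the property the summary credits for the lower memory use and faster queries.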