The December 2019 launch of STARR-OMOP, Stanford School of Medicine’s (SoM) next-generation analytical clinical data warehouse, was the culmination of a three-year journey to push the frontiers of artificial intelligence in medicine. STARR-OMOP stores about seven terabytes of electronic health record data from Stanford’s two hospitals in the Observational Medical Outcomes Partnership (OMOP) Common Data Model (CDM). By standardizing this huge archive and making it secure, private, and easier to access, the data warehouse enables faster and better data science. In October 2020, a COVID-19 network study published in Nature Communications became the first peer-reviewed article based on STARR-OMOP.
Stanford researchers access STARR-OMOP through Nero, a secure, HIPAA-compliant internal data platform for scientific research that integrates with Google Cloud’s enterprise data warehouse, BigQuery. With BigQuery, customers can analyze petabyte-scale data immediately, and their data analysts can run queries on it with zero operational overhead. “Cohort queries run ten to one hundred times faster on BigQuery when compared to an on-premise database, and it’s cost efficient. Even better is a managed service that scales with our users, automagically,” says Somalee Datta, Director of Research IT at SoM. Over 120 data scientists now have access to STARR-OMOP through Nero.