About Conftrace
Conftrace is an open browsing interface for 220 881 research papers from 36 top CS/AI conferences: NeurIPS, ICML, ICLR, CVPR, ICCV, ECCV, ACL, EMNLP, NAACL, AAAI, IJCAI, MICCAI, Interspeech, and 23 more venues.
What you can do here
- Search 220k papers by title, conference, year, topic or keyword.
- Browse 36 conferences with per-year paper counts, top authors, and DNA fingerprints.
- Explore a 290-topic taxonomy across the field, classified by an LLM.
- Walk through 17 000 keywords with co-occurrence graphs.
- Track emerging terms by their year-over-year growth in paper titles.
- Visualise the graph with 8 interactive presets, including Keyword Galaxy, Research Bridges, Time Machine, and Topic Landscape.
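As a rough illustration of the "emerging terms" idea, year-over-year growth can be computed from per-year title counts. This is a minimal sketch, not Conftrace's pipeline: the whitespace tokenisation, the function names, and the sample titles are all assumptions made for the example.

```python
from collections import Counter

def yearly_term_counts(papers):
    """Count, per year, how many titles mention each lowercase word."""
    counts = {}
    for year, title in papers:
        words = set(title.lower().split())  # count a word once per title
        counts.setdefault(year, Counter()).update(words)
    return counts

def yoy_growth(counts, term, year):
    """Relative change in title count for `term` from year - 1 to year."""
    prev = counts.get(year - 1, Counter())[term]
    curr = counts.get(year, Counter())[term]
    if prev == 0:
        return None  # growth is undefined without a baseline year
    return (curr - prev) / prev

# Hypothetical sample data: (year, title) pairs.
papers = [
    (2022, "Diffusion models for images"),
    (2022, "Graph networks"),
    (2023, "Diffusion for video"),
    (2023, "Latent diffusion at scale"),
    (2023, "Fast diffusion sampling"),
]
counts = yearly_term_counts(papers)
print(yoy_growth(counts, "diffusion", 2023))  # 1 title in 2022, 3 in 2023
```

Returning None when the previous year has no occurrences avoids a division by zero; a real ranking would likely also require a minimum count so that rare terms do not dominate.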
Who is behind this
Conftrace is built and maintained by Justyna Wojtczak (github.com/justi) as a personal research project. The site is open source: github.com/justi/research-explorer.
The data layer (scrapers, abstract extraction, taxonomy classification) is a continuation of Andrej Karpathy's 2014 researchpooler project. Karpathy outlined a "Stage 4 web interface" in the original README; Conftrace is that interface, finally built more than a decade later.
How is this different from Google Scholar / Semantic Scholar?
- Scoped, not universal. We index only 36 top-tier CS/AI venues: no preprints, no journals, no random aggregators. The signal-to-noise ratio is high.
- Taxonomy-first. Every paper is classified into a 3-level taxonomy of ~290 topics by an LLM with chain-of-thought reasoning. You can browse the field by structure, not just by keyword.
- Co-occurrence graphs. 17k keywords linked through shared papers, queryable as a graph. Find research bridges between disjoint areas.
- Free and open. No paywall, no login, no ads. Code and data pipeline are on GitHub.
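To make "keywords linked through shared papers" concrete, here is a minimal sketch of a co-occurrence graph. It assumes each paper carries a plain list of keywords; the function names and sample data are hypothetical and do not reflect Conftrace's actual code or schema.

```python
from collections import defaultdict
from itertools import combinations

def build_cooccurrence(paper_keywords):
    """Map each keyword pair to the number of papers that list both."""
    edges = defaultdict(int)
    for keywords in paper_keywords:
        # Sort so that each pair is stored under one canonical ordering.
        for a, b in combinations(sorted(set(keywords)), 2):
            edges[(a, b)] += 1
    return dict(edges)

def neighbours(edges, keyword):
    """Return {other_keyword: shared_paper_count} for one keyword."""
    result = {}
    for (a, b), weight in edges.items():
        if a == keyword:
            result[b] = weight
        elif b == keyword:
            result[a] = weight
    return result

# Hypothetical sample data: one keyword list per paper.
paper_keywords = [
    ["contrastive learning", "medical imaging"],
    ["contrastive learning", "speech recognition"],
    ["contrastive learning", "medical imaging", "segmentation"],
]
edges = build_cooccurrence(paper_keywords)
print(neighbours(edges, "contrastive learning"))
```

In this toy data "contrastive learning" is linked to both "medical imaging" and "speech recognition", which never share a paper themselves; a keyword in that position is one way to read a bridge between otherwise separate areas.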
Limitations & caveats
- The sample covers top-tier CS/AI conferences, not all of research. Trends and achievements should be read as "among 36 venues", not "in the world".
- ~16% of papers have no abstract (mostly pre-2015 publications without HTML pages). These are excluded from the public listing and search index but remain reachable by direct URL.
- Classification accuracy is ~98% on papers with an abstract (validated against held-out samples) and drops to ~85% on title-only papers.
- Data refresh cadence: scrapers run when new editions are announced (see MAINTENANCE.md for details).
See also: Methodology for technical details on data collection and classification.