Primary literature · systems research
Tie each landmark PDF to what teams ship: frontier labs and silicon, SageMaker / Vertex-style training planes, embedding databases behind RAG evals, and Cursor-style agents that operationalize tool-use research.
These are the teams, publications, and stacks researchers cite when arguing about scaling, safety, hardware, and deployment, so your reading map stays grounded in systems that actually ship.
DeepMind · Gemini · research @ scale
FAIR · open weights · PyTorch lineage
Frontier APIs · reasoning evals
Interpretability · constitutional RLHF
Research · Azure AI science
CUDA graphs · inference · NeMo research
Accelerators · edge AI science
MLX · on-device ML research
IBM Research · enterprise AI science
Tools · planners · grounded retrieval
Vertex AI, SageMaker, and open frameworks are how teams turn ablation studies into repeatable numbers. They map onto the methods sections you read: distributed training, experiment tracking, model registry, and serving.
Vertex AI · managed training & batch
SageMaker · Bedrock research sandboxes
Azure ML · enterprise MLOps
Lakehouse · MLflow lineage
Reproducible envs · containerized training
Atlas · vectors · AI workload data
Dynamic graphs · distributed research code
Graphs · TPU / XLA pathways
Open weights · PEFT · leaderboards
Experiment tracking · sweeps · model registry
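One small piece of "repeatable numbers" is being able to tell whether two runs used the same settings. The sketch below derives a stable fingerprint from a run configuration; `run_fingerprint` and the config keys are hypothetical names chosen for illustration, not the API of any tracking product listed above.

```python
import hashlib
import json


def run_fingerprint(config: dict) -> str:
    """Return a short, stable ID for a run configuration.

    Keys are sorted before hashing, so two dicts with the same contents
    produce the same ID regardless of insertion order.
    """
    canonical = json.dumps(config, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()[:12]


# Hypothetical ablation settings, used only to exercise the function.
baseline = {"lr": 3e-4, "seed": 0, "model": "tiny-transformer"}
reordered = {"model": "tiny-transformer", "seed": 0, "lr": 3e-4}
new_seed = {**baseline, "seed": 1}

print(run_fingerprint(baseline) == run_fingerprint(reordered))  # same settings
print(run_fingerprint(baseline) == run_fingerprint(new_seed))   # different seed
```

A real tracking setup would also record code version, data snapshot, and environment; this only covers the configuration part.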
ANN indexes, hybrid filters, and replication: the stacks that RAG papers implicitly benchmark when they claim retrieval-augmented gains.
Managed vectors · namespaces · metadata filters
GraphQL vectors · modular retrieval
Rust core · filtering-heavy RAG
Open vectors · billion-scale ANN
Embedded UX · rapid RAG prototypes
pgvector · relational + embeddings
RediSearch · low-latency vectors
dense_vector · lexical + semantic
Open lineage · k-NN serving
Ranking · tensors · hybrid serving
Cursor-style agents and Windsurf-class flows operationalize tool-use papers through multi-file edits, terminal access, and PR-aware refactors, shown alongside each landmark PDF.
Composer · codebase-wide agent edits
Cascade · Codeium IDE lineage
Copilot Chat · workspace agents
Extensions hub · Copilot host IDE
Autocomplete · IDE-native reasoning
AI Assistant · Junie · Fleet
Agent · Ghostwriter · hosted shells
Keyboard-first · LSP + AI plugins
Notebook GPUs · Gemini coding UX
Cloud workspaces · prebuild parity
Deep AI ML Research Lab
Architectures, training dynamics, evaluation, and deployment—plus how today’s stacks combine RAG, tools, and agentic reasoning. Papers are anchors; the through-line is practitioner-grade AI research literacy.
Frontier topics · primary literature
Six curated lanes where industry moved fastest—each links into our PDF viewer with reading maps and literacy notes so you engage like a practitioner, not a tourist.
Retrieve evidence first, generate second—the blueprint for factual assistants and enterprise copilots.
Thought traces paired with actions: inspect trajectories instead of hoping for a correct one-shot answer.
Sparse delegation to calculators, search, and APIs, a routing pattern that many production AI stacks now follow.
Contrastive image-text alignment unlocked zero-shot vision classifiers and sits upstream of diffusion conditioning.
Low-rank adapters train small update matrices on top of a frozen foundation model, which is how teams ship vertical AI without cloning GPT-scale weights.
Scratchpads before answers: a prompting-only change that produced large reasoning gains, ahead of RL-heavy agent trainers.
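The low-rank adapter lane above rests on simple arithmetic: instead of updating a full d × k weight matrix, a rank-r adapter trains two thin matrices of shapes d × r and r × k. The sketch below only counts trainable values; the function names and the layer sizes are illustrative assumptions, not figures from any specific model.

```python
def dense_update_params(d: int, k: int) -> int:
    """Trainable values when the full d x k weight matrix is fine-tuned."""
    return d * k


def low_rank_update_params(d: int, k: int, r: int) -> int:
    """Trainable values in a rank-r update B @ A (B is d x r, A is r x k)."""
    if not 0 < r <= min(d, k):
        raise ValueError("rank must be between 1 and min(d, k)")
    return r * (d + k)


# Illustrative sizes (assumptions, not taken from any particular model).
D, K, RANK = 4096, 4096, 8
dense = dense_update_params(D, K)
adapter = low_rank_update_params(D, K, RANK)
print(f"dense: {dense:,}  adapter: {adapter:,}  ratio: {adapter / dense:.2%}")
```

With these sizes the adapter trains 65,536 values against 16,777,216 for the dense update, roughly 0.39% of the original count for that one layer.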
Reading journeys
These are intentional sequences—not a generic playlist. Follow one path to build a coherent mental model: how ideas cite, critique, and replace each other in modern ML and AI systems research.
Encoder–decoder intuition → self-attention → pre-training at scale.
CNN watershed moments → residual depth → diffusion fundamentals.
Deep RL from pixels → planning under uncertainty.
Prompted reasoning → language-conditioned actions → learned tool invocation.
Three entry points from our curated set—open any paper in the lab viewer, then follow the method & experiments thread.
Evidence layer · tiered
Filters aren’t gates—they’re pacing guides. Move up when experiment sections feel familiar and limitations spark ideas instead of confusion.
Readable landmarks that anchor intuition—CNNs, sequences, games, and early generative ideas.
ImageNet Classification with Deep Convolutional Neural Networks
Generative Adversarial Nets
Sequence to Sequence Learning with Neural Networks
Playing Atari with Deep Reinforcement Learning
Chain-of-Thought Prompting Elicits Reasoning in Large Language Models
Architecture depth, transformers, diffusion, RAG, multimodal alignment, and efficient tuning.
Deep Residual Learning for Image Recognition
Attention Is All You Need
Core skill — read methods first
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
Denoising Diffusion Probabilistic Models
Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks
Learning Transferable Visual Models From Natural Language Supervision
LoRA: Low-Rank Adaptation of Large Language Models
Large-scale LM behaviors, RL under uncertainty, and production-grade agent/tool stacks.
Mastering the game of Go with deep neural networks and tree search
Language Models are Few-Shot Learners
ReAct: Synergizing Reasoning and Acting in Language Models
Toolformer: Language Models Can Teach Themselves to Use Tools
From research insight to production
Clear hypotheses, solid metrics, and reproducible pipelines are as important in the lab as in shipping. Below are example domains where research-grade ML meets real users, compliance, and scale.
Assembly line — research → AI products
Healthcare & life sciences
Triage, radiology assistants, and pathway support, always with audit logs, calibration checks, and human-in-the-loop review grounded in published benchmarks.
Research → validation → deployment
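The healthcare card mentions "calibration checks" without naming a metric. One common choice is expected calibration error (ECE), sketched here under that assumption: predictions are grouped into equal-width confidence bins, and the gap between average confidence and observed accuracy is averaged across bins, weighted by bin size. The function name, bin count, and sample values are illustrative.

```python
def expected_calibration_error(confidences: list[float],
                               correct: list[bool],
                               n_bins: int = 10) -> float:
    """Weighted mean of |accuracy - mean confidence| over equal-width bins."""
    if len(confidences) != len(correct):
        raise ValueError("confidences and correct must have the same length")
    total = len(confidences)
    if total == 0:
        return 0.0

    # Bin i holds confidences in [i / n_bins, (i + 1) / n_bins);
    # a confidence of exactly 1.0 is placed in the last bin.
    bins: list[list[tuple[float, bool]]] = [[] for _ in range(n_bins)]
    for conf, ok in zip(confidences, correct):
        index = min(int(conf * n_bins), n_bins - 1)
        bins[index].append((conf, ok))

    ece = 0.0
    for items in bins:
        if not items:
            continue
        mean_conf = sum(conf for conf, _ in items) / len(items)
        accuracy = sum(1 for _, ok in items if ok) / len(items)
        ece += (len(items) / total) * abs(accuracy - mean_conf)
    return ece


# Illustrative data: the model claims 90% confidence but is right 3 times in 4.
overconfident = expected_calibration_error([0.9, 0.9, 0.9, 0.9],
                                           [True, True, True, False])
print(round(overconfident, 3))
```

A score near zero means stated confidence tracks observed accuracy; the overconfident example above lands at about 0.15. ECE depends on the binning scheme, so a review process would normally fix the bin count before comparing models.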
Finance & markets
Sequence models and robust ensembles for credit, trading analytics, and anomaly detection—with stress tests and drift monitoring tied to reproducible ablations.
Education
Adaptive practice, feedback generation, and integrity tooling—built from cited methods, fairness review, and clear metrics instead of opaque black boxes.
Document intelligence
Layouts, tables, and long-form PDFs into structured data—combining vision encoders and language models with traceable spans for compliance reviews.
Tax & accounting
Hierarchical labels, entity linking, and jurisdiction-aware rules engines—trained on curated corpora with explicit error analysis on edge cases.
Legal & compliance
Retrieval over corpora plus grounded generation—citations to source passages, versioned prompts, and evaluation sets that mirror real reviewer workflows.
Retail & operations
Forecasting stacks and shelf or warehouse vision—closed-loop evaluation on held-out seasons and geos, not just offline accuracy slides.
Public & civic systems
Transparent scoring and monitoring for services and infrastructure—documentation and bias checks treated as part of the product, not an afterthought.