The TWIML AI Podcast with Sam Charrington
Overcoming Oscillations in Quantization-Aware Training, Variational On-the-Fly Personalization, and CITRIS: Causal Identifiability from Temporal Intervened Sequences.
Guest: Arash Behboodi
Forecasting from LiDAR via Future Object Detection, which proposes an end-to-end approach for detection and motion forecasting based on raw sensor measurements as opposed to ground-truth tracks. Finally, we discuss Aljosa’s third and final paper, Opening up Open-World Tracking, which proposes a new benchmark to analyze existing efforts in multi-object tracking and constructs a baseline for these tasks.
Guest: Aljosa Osep
Unsupervised Domain Generalization by Learning a Bridge Across Domains.
Guest: Kate Saenko
Imposing Consistency for Optical Flow Estimation, a paper that introduces novel and effective consistency strategies for optical flow estimation. The final paper we discuss is IRISformer: Dense Vision Transformers for Single-Image Inverse Rendering in Indoor Scenes, which proposes a transformer architecture to simultaneously estimate depths, normals, spatially-varying albedo, roughness, and lighting from a single image of an indoor scene. For each paper, we explore the motivations and challenges and get concrete examples to demonstrate each problem and solution presented.
Guest: Fatih Porikli
https://twimlai.com/podcast/twimlai/series/data-centric-ai.
Guest: D. Sculley
World Models and Attention for Reinforcement Learning, and The Sensory Neuron as a Transformer: Permutation-Invariant Neural Networks for Reinforcement Learning.
This interview is Nerd Alert certified, so get your notes ready!
PS. David is one of our favorite follows on Twitter (@hardmaru), so check him out and share your thoughts on this interview and his work!
Guest: David Ha
WiCluster: Passive Indoor 2D/3D Positioning using WiFi without Precise Labels, explores the use of RF signals to infer what an environment looks like, enabling the estimation of a person's movement.
We also discuss how machine learning and AI can help enable 5G and make it more efficient for these applications, the scenarios in which ML would allow for more effective delivery of connected services, and what might be possible in the near future.
Guest: Joseph Soriaga
conversations with Ville, we explored his experience building and deploying the open-source framework, Metaflow, while working at Netflix. Since our last chat, Ville has embarked on a few new journeys, including writing the upcoming book Effective Data Science Infrastructure, and commercializing Metaflow, both of which we dig into quite a bit in this conversation.
We reintroduce the problem that Metaflow was built to solve and discuss some of the unique use cases that Ville has seen since its release, the relationship between Metaflow and Kubernetes, and the maturity of services like batch and lambdas that allows a complete production ML system to be delivered. Finally, we discuss the degree to which Ville is focusing Outerbounds' efforts on building tools for the MLOps community, and what the future looks like for him and Metaflow.
Guest: Ville Tuulos
Jesse Engel.
Guest: Alexander Richard
Milind Tambe, as well as a project focused on using ML techniques to assist in the identification of people in need of housing resources, and ensuring that they get the best interventions possible.
If you enjoyed this conversation, I encourage you to check out our conversation with Milind Tambe from last year's TWIMLfest on Why AI Innovation and Social Impact Go Hand in Hand.
Guest: Eric Rice
which you can catch the videos for here).
Guests: Chris Fregly, Antje Barth
In our conversation with Jabran, we explore his team's recent endeavor into the complete mapping of which T-cells bind to which antigens through the Antigen Map Project. We discuss how Jabran's background in astrophysics and cosmology has translated to his current work in immunology and biology, the origins of the antigen map, a walkthrough of the biological aspects of the project, and how its focus was changed by the emergence of the coronavirus pandemic.
We talk through the biological advancements, the challenges of using machine learning in this setting, some of the more advanced ML techniques that they've tried that have not panned out (as of yet), the path forward for the antigen map to make a broader impact, and much more.
Guest: Jabran Zahid
Advancing Your Data Science Career During the Pandemic panel, where she shared her experience trying to navigate the suddenly hectic data science job market. Now, a year removed from that panel, we explore her book on data science careers, top insights for folks just getting into the field, ways that job seekers should be signaling that they have the required background, and how to approach and navigate failure as a data scientist.
We also spend quite a bit of time discussing Dask, an open-source library for parallel computing in Python, as well as use cases for the tool, the relationship between Dask and Kubernetes/Docker containers, where data scientists stand with regard to the software development toolchain, and much more!
Guest: Jacqueline Nolis
Today we're joined by Melanie Mitchell, Davis Professor at the Santa Fe Institute and author of Artificial Intelligence: A Guide for Thinking Humans. While Melanie has had a long career with a myriad of research interests, we focus on a few: complex systems, the understanding of intelligence and complexity, and her recent work on getting AI systems to make analogies. We explore examples of social learning, how it applies to AI contextually, and how to define intelligence.
We discuss potential frameworks that would help machines understand analogies, established benchmarks for analogy, and if there is a social learning solution to help machines figure out analogy. Finally, we talk through the overall state of AI systems, the progress we've made amid the limited concept of social learning, if we're able to achieve intelligence with current approaches to AI, and much more!
Guest: Melanie Mitchell
@samcharrington or @twimlai.
To follow along with the 2020 AI Rewind Series, head over to the series page.
Guest: Michael Bronstein
Guest: Sameer Singh
Guest: Pavan Turaga
Guest: Pablo Samuel Castro
here!
Guests: Sina Bahram, Cynthia Bennet, Chancey Fleet, Venkatesh Potluri, Meredith Ringel Morris
here.
Guest: Jeremy Howard
Visualizing The Consequences Of Climate Change Using Cycle-consistent Adversarial Networks,' and we're excited to pick her brain about the ways ML is currently being leveraged to help the environment. In our conversation, we explore the use of GANs to visualize the consequences of climate change, the evolution of different approaches she used, and the challenges of training GANs using an end-to-end pipeline.
Finally, we talk through Sasha's goals for the aforementioned panel, which is scheduled for Friday, October 23rd at 1 pm PT. Register for all of the great TWIMLfest sessions here!
Guest: Sasha Luccioni
The IBM Data Science Community site, which has over 10,000 members, provides a place for data scientists to collaborate, share knowledge, and support one another. It's also a great place to connect with other data scientists and to find information and resources to support your career.
Join and get a free month of select IBM Programs on Coursera.
Guests: Chris Nuernberger, Huda Nassar, Burak Kanber, Catherine Nelson, Gabriela de Queiroz, Avi Bryant, Chris Lattner
YouTube channel!
Guest: Rumman Chowdhury
wandb.com/twiml.
Guest: Lukas Biewald
Twitter for updates.
Big shout out to IBM for their support in helping to make this panel possible! IBM continues to support major initiatives -- applying data, knowledge, computing power and insights to solve the challenging problems presented by the coronavirus. Some of these initiatives include their work with the High-Performance Computing Consortium, providing detailed virus tracking information on the Weather Channel, and offering free access to Watson Assistant for COVID-19 related applications. Click here to find out more about IBM’s response.
Guests: Rex Douglass, Robert Munro, Lea Shanley, Gigi Yuen-Reed
The Measure and Mismeasure of Fairness: A Critical Review of Fair Machine Learning," which identifies three formal definitions of fairness in algorithms, the statistical limitations of each, and details how mathematical formalizations of fairness could be introduced into algorithms.
Guest: Sharad Goel
here.
Guest: Beidi Chen
Ultrasound Nerve Segmentation competition.
His secret sauce was a ground-up implementation of the U-net architecture (an encoder-decoder network), which hadn't been used in a Kaggle competition. To "preserve the localization information from the original images...I implemented [the U-net architecture]. I trained the network, and to my surprise, it worked."
He was not the only one to use a U-net in this competition, but he thinks his competitive edge came from being one of the only ones to learn how to build it from scratch. "If you did something different or you implemented your own and improved it a little bit, maybe more than what everybody else was using, you had a chance of doing a little better…"
David's solution for this competition also benefited from a lot of experimenting with data augmentation strategies. "When you train segmentation networks to avoid overfitting, you have to augment the image and the mask together and you have to find the right kind of augmentation strategies...so I hacked those and created a good augmentation strategy [and] framework."
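The joint image-and-mask augmentation David describes can be sketched in a few lines. The specific transforms below (horizontal flips and 90-degree rotations) are illustrative assumptions, not his exact strategy; the point is that image and mask receive the same random transform:

```python
import numpy as np

def augment_pair(image, mask, rng):
    """Apply the SAME random transform to an image and its segmentation mask.

    Augmenting the pair jointly keeps the mask aligned with the image, which
    lets a segmentation network see new examples without corrupting labels.
    """
    if rng.random() < 0.5:                      # random horizontal flip
        image, mask = image[:, ::-1], mask[:, ::-1]
    k = int(rng.integers(0, 4))                 # random 0/90/180/270 rotation
    return np.rot90(image, k), np.rot90(mask, k)

rng = np.random.default_rng(0)
img = np.arange(16).reshape(4, 4)
msk = (img > 7).astype(np.uint8)
aug_img, aug_msk = augment_pair(img, msk, rng)
# Because both were transformed together, the mask still labels the same pixels:
assert np.array_equal(aug_msk, (aug_img > 7).astype(np.uint8))
```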
David ultimately finished second out of 950 teams in the competition that year. Not bad for his second effort!
ProVision Body Scanners and Architectures for 3-Dimensional Data
Encouraged, David continued to participate in Kaggle. Following his early success was a competition, sponsored by the Department of Homeland Security, focused on classifying images from airport body scanners (a.k.a. Nude-O-Scopes, as Sam calls them). The goal was to create new algorithms that could more accurately predict threats and detect prohibited items during the screening process. It was the largest Kaggle competition both in terms of prize money ($1.5 million) and in terms of the size of the data set being used.
The Passenger Screening Algorithm Challenge was particularly interesting to David in its use of three-dimensional data. There were no existing best practices for how to build architectures that could process 3D data without downsizing or downsampling. Three-dimensional images require much more memory and storage to process than 2D images, but also create new opportunities. The third dimension provides a "third axis where you also can correlate features across multiple two-dimensional images because the volume is essentially a stack of two-dimensional images."
David was still in the middle of his PhD research and had already been thinking about three-dimensional data for CT or MRI images. His entry for the competition would apply the same architecture that he had been using to try to detect Parkinson's disease from brain scans.
His method involved dimensionality reduction by combining a 2D convolutional neural network (CNN) with a Long Short-Term Memory (LSTM) architecture that models sequences of data. Essentially, the CNN learned two-dimensional vectors from each of the images, and fed them into the LSTM, which could take advantage of the relationships between the frames. This allowed the team to avoid reducing the resolution of the input images.
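Structurally, the approach can be sketched as a per-slice encoder feeding a sequence model. The toy functions below are stand-ins (a real implementation would use trained CNN and LSTM layers), but they show how the volume is treated as an ordered stack of 2D slices:

```python
import numpy as np

def encode_slice(slice2d):
    """Stand-in for the 2D CNN: compress one slice into a feature vector."""
    return np.array([slice2d.mean(), slice2d.std(), slice2d.max()])

def aggregate_sequence(features, decay=0.5):
    """Stand-in for the LSTM: fold the ordered per-slice features into a
    single state, exploiting correlations between neighbouring slices."""
    state = np.zeros_like(features[0])
    for f in features:                    # process slices in stacking order
        state = decay * state + (1 - decay) * f
    return state

volume = np.random.default_rng(1).random((32, 64, 64))   # 32 stacked 2D slices
per_slice = [encode_slice(s) for s in volume]            # "CNN" stage
summary = aggregate_sequence(per_slice)                  # "LSTM" stage
print(summary.shape)   # (3,) -- one compact vector for the whole 3D volume
```

Because each slice is encoded in 2D before the sequence stage, the full-resolution volume never has to fit through a 3D network at once.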
Sam posits that "A lot of winning the competition is being on the winning side of information asymmetry." It can be hard to gauge where you might stand in the competition to make sure your efforts are worth it. David tries to plan ahead by recognizing signs of promise from the beginning, which for him means placing in the top 30 on the leaderboard as a result of his initial efforts in a problem.
Distracted Drivers and Data Augmentation
Building on challenges with processing image data, another Kaggle competition David participated in was the State Farm Distracted Driver Detection challenge. The problem was to identify distracted drivers by reviewing images to determine whether the driver was doing things like playing with the radio, using the phone or applying makeup.
The team's unique approach was to implement a creative data augmentation technique to train the model. The technique involved taking, for example, two images of a driver playing with the radio. They would then vertically or horizontally combine 75% of one image with 25% of the other to get an additional image. Because both source images show the same behavior, the spliced image still depicts a distracted driver and carries the same label.
Combining images in this way was a solution to avoid overfitting. The data they were dealing with had only a few examples of the distracted driving behavior they were trying to identify in the training set, causing their neural networks to tend to overfit. (That is, they memorized the few examples they found in the training set and struggled to generalize to an unseen validation set.) By combining the images, they both created additional training examples and broke the network's tendency to rely on spurious patterns in identifying examples of distracted driving.
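A minimal sketch of this splicing augmentation (the 75/25 split follows the description above; the images, shapes, and axis choice are illustrative):

```python
import numpy as np

def splice(img_a, img_b, frac=0.75, axis=0):
    """Join the first `frac` of img_a with the remainder of img_b along one
    axis. Both inputs show the same behavior (e.g. playing with the radio),
    so the spliced image keeps that label while looking new to the network."""
    cut = int(img_a.shape[axis] * frac)
    idx_a = [slice(None)] * img_a.ndim
    idx_b = [slice(None)] * img_b.ndim
    idx_a[axis] = slice(0, cut)
    idx_b[axis] = slice(cut, None)
    return np.concatenate([img_a[tuple(idx_a)], img_b[tuple(idx_b)]], axis=axis)

a = np.zeros((8, 8), dtype=np.uint8)     # image 1 of the behavior
b = np.ones((8, 8), dtype=np.uint8)      # image 2 of the same behavior
mix = splice(a, b, frac=0.75, axis=0)    # vertical splice: 6 rows of a, 2 of b
print(mix.shape)   # (8, 8) -- a brand-new training example with the same label
```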
Otherwise, though, they used an off-the-shelf model architecture, demonstrating that it's not always a unique architecture that wins the competition. According to David, you don't necessarily need a massive ensemble of models to win Kaggle competitions either:
"If you focus on one model, you can almost do as well as a massive ensemble, but oftentimes the ensemble is the easy way out. But the ensembles, there's a cost associated with that, at least for computer vision, in terms of GPU time. If you have infinite compute resources, you might be able to get away with ensembling, but oftentimes you have to weigh the cost of training many models with focusing on one and trying to get it as good as possible."
"These are some of my secrets."
David has a few tricks to share that apply to everyone, beginners and experts alike:
Keep it Simple. "The key...is that these solutions are usually simple...there's this idea that starting Kaggle [is] hard…I feel like a lot of challenges you just have to look at it with a creative approach and just opening your mind that the solution is simple."
Persistence. David also emphasizes that "Kaggle can be discouraging." But you have to believe you can do well and give it a shot regardless, even if you don't do well initially.
Reading Top Solutions. Another trick is to read the approaches from Kaggle winners so you can compare their solutions with your own to learn from what you could have improved.
Additional Tips
In addition to the tips David emphasized above, here are a few additional suggestions we gleaned from the interview:
Kaggle Discussion Forums. Digging into forums to see what other people are doing is a great way to learn what angles and perspectives others are using that might help you approach the challenge.
Teaming Up. In most of his competitions, David has teamed up with others who all bring unique perspectives and help with the challenges.
Kernels. Kaggle is collaborative and kernels might be a great place to get you started, but it's also a competition and as David puts it simply, "if you do what everybody else does, you're not going to win."
If you're interested in joining Kaggle, or want to be part of a supportive community of folks working on Kaggle projects together, check out our Kaggle study group! The group hosts virtual meetups every week. Learn more at http://twimlai.com/program/kaggle-team/.
Guest: David Odaibo
Bruno Gonçalves.
"The idea is essentially to look under the microscope of how science works, meaning for example, how it evolves over time, how collaboration occurs between different scientists, in between different fields. How scientists pick their research problems, how they, for example, move across different institutions, how nations develop expertise in different fields of research and so on."
In addition to predicting the trajectory of physics research, Matteo is also active in the computational epidemiology field. His work in that area involves building simulators that can model the spread of diseases like Zika or the seasonal flu at a global scale.
Science of Science
Matteo's background in economics and his interest in human behavior sparked his desire to explore the "science of science." Physics was the natural starting point since he already worked with many individuals in the field.
To build his models, Matteo uses a core data set of papers published in the journals of The American Physical Society. This dataset was chosen in part because of the robustness of its classification scheme, the Physics and Astronomy Classification Scheme (PACS), which provides references to affiliated topics, authors and publications for each of the papers in the archive. PACS also provides a consistent set of keywords for each of the papers.
These keywords are used to relate the various physics researchers to one another using an embedding model. In Matteo's case, the model they use is StarSpace, developed by Facebook AI Research.
As Matteo puts it, "We are treating each author as a bag of topics, a bag of research fields in which that author has worked. Then we use this bag of topics to infer the embeddings for each specific research sub-area."
Having created an embedding that relates the various research topics to one another, Matteo and his co-authors then use it to create what they call the Research Space Network (RSN). The RSN is a "mapping of the research space [created] by essentially looking at the expertise of authors to guide us on what it means for two topics to be similar to each other."
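As an illustration of the bag-of-topics idea, here is a toy stand-in for the StarSpace model, with made-up topics and hand-picked two-dimensional vectors (the real embeddings are learned from the PACS keywords):

```python
import numpy as np

# Hypothetical topic embeddings -- hand-picked 2-D vectors for illustration.
topics = {"quantum": np.array([1.0, 0.0]),
          "condensed": np.array([0.8, 0.6]),
          "astro": np.array([0.0, 1.0])}

def author_vector(bag):
    """An author is represented as the average of the topics they've worked in."""
    return np.mean([topics[t] for t in bag], axis=0)

def cosine(u, v):
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

alice = author_vector(["quantum", "condensed"])
bob = author_vector(["quantum"])
carol = author_vector(["astro"])
# Authors with overlapping bags of topics end up close in the space:
print(cosine(alice, bob) > cosine(carol, bob))   # True
```

Topic-to-topic similarities derived this way are what the Research Space Network is built from.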
Principle of Relatedness
One of the main findings from the research so far is what Matteo refers to as a "fingerprint" of the scientific production of cities. The work is based on the idea of The Principle of Relatedness, an economics term that aims to measure the relationship between a nation's overall production, exports, expertise, and trade partners to predict what items the country should export next.
In applying this idea to their research, Matteo would look at all of the scientific publications from a city and use the embedding space to measure the level of relatedness, and predict the direction of the city's scientific knowledge. You can use a network to visually show the interactions between different vectors (science topics) and rank the probability that a city will enter a specific field. That ranking becomes your "classifier" and allows you to determine where that field will or will not be developed next.
If you were to plot out the topics of existing research in a city, you could see where the "knowledge density" collects, and note where the density is high, to predict the trajectory of research. If a country is in an intermediate stage of development, there's a higher chance of "jumping" to a different space.
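The knowledge-density idea above can be sketched as ranking candidate topics by their average similarity to a city's existing research (the topic vectors below are hypothetical, standing in for the learned Research Space Network):

```python
import numpy as np

# Hypothetical topic vectors standing in for the learned embedding space.
emb = {"optics": np.array([1.0, 0.1]),
       "photonics": np.array([0.9, 0.2]),
       "astro": np.array([0.0, 1.0]),
       "cosmology": np.array([0.1, 0.9])}

def cosine(u, v):
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

def knowledge_density(candidate, active):
    """Average similarity of a candidate topic to the topics a city is
    already active in; high density suggests a likely next entry."""
    return float(np.mean([cosine(emb[candidate], emb[t]) for t in active]))

city_active = ["optics"]                      # the city already works in optics
candidates = ["photonics", "cosmology"]
ranked = sorted(candidates, key=lambda t: knowledge_density(t, city_active),
                reverse=True)
print(ranked[0])   # photonics -- closest to the city's existing expertise
```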
Focus and Limitations
The focus, for now, is to find the best way of creating embeddings for a very specific problem, not for a variety of tasks.
For example, there is no weighting of a researcher's volume of work or its relative importance--the associations include anything they've been active in. Likewise, for some analyses, you might want to identify where the scientist is most active and remove any side projects or abandoned subjects.
None of these are considered in this paper. Rather, Matteo approaches the problem from the simplest possible scenario, effectively asking "What if we are blind?"
"We...get a big pile of papers from an author. We just list all the topics in which he has worked on and train on that."
They want to prove that you do not need to perform manual checks and optimizations to get useful results.
Performance Metrics
Matteo tested the results using a couple of different validations:
One approach was to visualize the RSN and regional fingerprints for assessment. This made it easy to see the macro areas where the PACS classification distinguishes the different subfields of physics. This hierarchy was not used at training time and the algorithm was able to determine the right classification.
The second method was to measure the predictive power of the algorithm by looking at each city at a given time period and listing the topics where they had a competitive advantage. Then they compared them using a standard metric like an ROC curve to see if the model was performing better than a random model.
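The ROC comparison can be reproduced with the rank-sum identity for AUC. This is a small self-contained version, since the actual evaluation pipeline isn't described in detail here:

```python
import numpy as np

def auc(scores, labels):
    """Area under the ROC curve via the rank-sum identity: the probability
    that a randomly chosen positive outscores a randomly chosen negative."""
    pos, neg = scores[labels == 1], scores[labels == 0]
    wins = ((pos[:, None] > neg[None, :]).sum()
            + 0.5 * (pos[:, None] == neg[None, :]).sum())
    return wins / (len(pos) * len(neg))

labels = np.array([1, 1, 0, 0, 1, 0])               # did the city enter the topic?
scores = np.array([0.9, 0.8, 0.3, 0.2, 0.7, 0.4])   # predicted relatedness
print(auc(scores, labels))   # 1.0 here; a random model would hover around 0.5
```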
What's Next?
While the goal is to eventually expand and apply these techniques to entire papers (vs just the PACS keywords), having a predetermined taxonomy and hierarchical structure laid out gives them a benchmark to validate their own observations.
Scaling this approach to other fields is something they are starting to work on. They've made some progress using the Microsoft Academic Graph, which includes all the different fields of science. As of now, they can't replicate the results they get when they apply the algorithm to physics, but the embedding space could be evolved to track things like the semantics of a term over time, or how authors tend to move through this space. There's also the possibility of finding gaps in the science and making connections that the field might not know to make.
Guest: Matteo Chinazzi
Online Classification with Complex Metrics on making models that optimize complex, non-decomposable metrics. (Non-decomposable here means you can't write the metric as an average, which would allow you to apply existing tools like gradient descent.)
Scaling up to More Complex Measures
To generalize this idea beyond simple binary classifiers, we have to think about the confusion matrix, which is a key statistical tool used in assessing classifiers. The confusion matrix measures the distribution of predictions that a classifier makes given an input with a certain label.
Sanmi's research provided guidance for building models that optimized arbitrary metrics based on the confusion matrix.
"Initially we work[ed out] linear weighted combinations. Eventually, we got to ratios of linear things, which captures things like F-measure. Now we're at the point where we can pretty much do any function of the confusion matrix."
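Sanmi's progression, from linear combinations of confusion-matrix entries to ratios of linear terms like F-measure, is easy to see on a binary confusion matrix. The example below is illustrative:

```python
import numpy as np

def confusion(y_true, y_pred):
    """Binary confusion-matrix entries: TP, FP, FN, TN."""
    tp = int(np.sum((y_true == 1) & (y_pred == 1)))
    fp = int(np.sum((y_true == 0) & (y_pred == 1)))
    fn = int(np.sum((y_true == 1) & (y_pred == 0)))
    tn = int(np.sum((y_true == 0) & (y_pred == 0)))
    return tp, fp, fn, tn

y_true = np.array([1, 1, 1, 0, 0, 0])
y_pred = np.array([1, 1, 0, 1, 0, 0])
tp, fp, fn, tn = confusion(y_true, y_pred)

accuracy = (tp + tn) / (tp + fp + fn + tn)   # a linear combination of entries
f1 = 2 * tp / (2 * tp + fp + fn)             # a ratio of linear terms (F-measure)
print(round(accuracy, 3), round(f1, 3))
```

Both metrics are functions of the same four entries; they differ only in the shape of that function, which is exactly the axis along which Sanmi's framework generalizes.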
Domain Experts and Metric Elicitation
Having developed a framework for optimizing classifiers against complex performance metrics, the next question Sanmi asked (because it was the next question asked of him), is which one should you choose for a particular problem? This is where metric elicitation comes in.
The idea is to flip the question around and try to determine good metrics for a particular problem by interacting with experts or users to determine which of the metrics we can now optimize for best approximate how the experts are making trade-offs against various types of predictions or classification errors.
For example, a doctor understands the costs associated with diagnosing or misdiagnosing someone with a disease. The trade-off factors could include treatment prices or side effects--factors that can be compressed to the pros/cons of predicting a diagnosis or not. Building a trade-off function for these decisions is difficult. Metric elicitation allows us to identify the preferences of doctors through a series of interactions with them, and to identify the trade-offs that correspond to those preferences. Once we know these trade-offs, we can build a metric that captures them, which allows you to optimize for those preferences directly in your models using the techniques Sanmi developed earlier.
In research developed with Gaurush Hiranandani and other colleagues at the University of Illinois, Performance Metric Elicitation from Pairwise Classifier Comparisons proposes a system of asking experts to rank pairs of preferences, kind of like an eye exam for machine learning metrics.
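The "eye exam" analogy can be sketched as a binary search driven by pairwise answers. The oracle below is a hypothetical expert with a hidden trade-off weight, not the paper's actual elicitation procedure:

```python
def elicit_tradeoff(prefers, lo=0.0, hi=1.0, steps=20):
    """Binary-search an expert's trade-off weight from pairwise answers.

    `prefers(a, b)` is an oracle answering "do you prefer weight a over b?"
    -- the pairwise 'eye exam' question; each answer halves the interval.
    """
    for _ in range(steps):
        mid = (lo + hi) / 2
        a, b = (lo + mid) / 2, (mid + hi) / 2
        if prefers(a, b):
            hi = mid
        else:
            lo = mid
    return (lo + hi) / 2

# A hypothetical expert whose true (hidden) trade-off weight is 0.3:
true_w = 0.3
oracle = lambda a, b: abs(a - true_w) < abs(b - true_w)
print(round(elicit_tradeoff(oracle), 3))   # 0.3
```

Twenty pairwise questions pin the weight down to within about one part in a million, which is why simple comparisons are enough to recover a usable metric.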
Metric Elicitation and Inverse Reinforcement Learning
Sanmi notes that learning metrics in this manner is similar to inverse reinforcement learning, where reward functions are being learned, often by interaction with humans. However, the fields differ in that RL is more focused on replicating behavior rather than getting the reward function correct. Metric elicitation, on the other hand, is focused on replicating the same decision-making reward function as the human expert. Matching the model's reward function, as opposed to the model's behavior, has the benefit of greater generalizability, which allows metrics that are agnostic to data distribution or the specific learner you're using.
Sanmi mentions another interesting area of application around fairness and bias, where you have different measures of fairness that correspond to different notions of trade-offs. Upcoming research is focused on finding "elicitation procedures that build context-specific notions of metrics or statistics" that should be normalized across groups to reach a fairness goal in a specific setting.
Robust Distributed Learning
This interview also covers Sanmi's research into robust distributed learning, which aims to harden distributed machine learning systems against adversarial attacks.
Be sure to check out the full interview for the interesting discussion Sam and Sanmi had on both metric elicitation and robust distributed learning. The latter discussion starts about 33 minutes into the interview.
Guest: Sanmi Koyejo
Recent Advances in Algorithmic High-Dimensional Robust Statistics. The survey covers about 100 papers, exploring the techniques that have been developed in the space so far, and evaluates what direction the community should go in next. The survey will be published soon as part of the book Beyond Worst-Case Analysis.
Practical Implications: Data Poisoning and Implementation
One of the practical implications of robust statistics is the prevention of data poisoning. Data poisoning occurs when a system ingests data from outside sources and is vulnerable to malicious users who insert fake data that corrupts the behavior of the model.
While the potential for applications is large, what's holding back widespread implementation is that the algorithms use spectral methods and are not as automatic as the machine learning community wants them to be. Further, many real-world problems are non-convex, meaning that SGD can't be applied directly. Ilias believes that can change soon, and is currently working on eliminating the need for these bespoke algorithms by giving structure to the non-convex problems, formulating them in such a way that SGD can sufficiently solve them.
Guest: Ilias Diakonikolas
attention for being among the first to publicly warn about the coronavirus (COVID-19) that initially appeared in the Chinese city of Wuhan. How did the company's system of data gathering techniques and algorithms help flag the potential dangers of the disease? In this interview, Kamran shares how they use a variety of machine learning techniques to track, analyze and predict infectious disease outbreaks.
As a practicing physician based in Toronto, Kamran was directly impacted by the SARS outbreak in 2003. "We saw our hospitals completely overwhelmed. They went into lockdown. All elective procedures were canceled...even the city took on a different feel...there were billions of financial losses...and Toronto was just one of dozens." In the wake of that crisis, governments have been slow to act. Efforts like the International Health Regulations Treaty (2005), which aims to standardize communication about diseases, help but are not well enforced. It doesn't help that these nations are often unaware of the severity of an outbreak, or are hesitant to report a threat because of potential economic consequences.
Ultimately, his experience with the SARS crisis led Kamran to explore the role technology might play in anticipating outbreaks and predicting how they might spread. Kamran's insight ultimately lead to the creation of BlueDot, which applies machine learning to four main challenges in infectious disease tracking: Surveillance, Dispersion, Impact, and Communication.
Surveillance
The BlueDot engine gathers data on over 150 diseases and syndromes around the world, looking at over 100,000 online articles each day spanning 65 languages, searching every 15 minutes, 24 hours a day. This includes official data from organizations like the Centers for Disease Control or the World Health Organization, but also counts on less structured, local information from journalists and healthcare workers.
BlueDot's epidemiologists and physicians manually classified the data and developed a taxonomy so relevant keywords could be scanned efficiently. They later applied ML and NLP to train the system. Kamran points out that the algorithms in place perform "relatively low-complexity tasks, but they're incredibly high volume and there's an enormous amount of them, so we can simply train a machine to replicate our judgment [for classifying]".
As a result of their system's algorithms, only a handful of cases are flagged for human experts to analyze. In the case of COVID-19, the system highlighted articles in Chinese that reported 27 pneumonia cases associated with a market that had seafood and live animals in Wuhan.
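A toy version of that surveillance filter, with a made-up taxonomy and article titles, shows why the task is low-complexity but high-volume:

```python
# Made-up taxonomy keywords and article titles for illustration only.
taxonomy = {"pneumonia", "outbreak", "coronavirus"}

articles = [
    "27 pneumonia cases linked to a seafood market in Wuhan",
    "Local elections scheduled for spring",
    "Hospital staff report a new coronavirus strain",
]

# Only articles containing taxonomy keywords reach the human experts:
flagged = [a for a in articles if taxonomy & set(a.lower().split())]
print(len(flagged))   # 2
```

Each individual match is trivial, but run against 100,000 articles a day in 65 languages, automating it is what makes the human review tractable.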
Dispersion
Recognizing the role that travel plays in disease dispersion—especially in the age of air travel—BlueDot uses geographic information system (GIS) data and flight ticket sales to create a dispersion graph for each disease based on the airports connected to a city and where passengers are likely to fly. Not everyone travels by air, so they also use anonymized location data from 400 million mobile devices to track flows from outbreak epicenters to other parts of the region or world. The locations receiving the highest volume of travelers are identified and diligently evaluated for what the impact of the disease could be in the area.
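The core of the dispersion stage can be sketched as ranking destinations by outbound passenger volume from the epicenter (all figures below are invented for illustration):

```python
# All passenger volumes below are invented for illustration.
flights_from_epicenter = {
    "Bangkok": 120_000, "Hong Kong": 95_000, "Tokyo": 90_000,
    "Seoul": 70_000, "Taipei": 65_000, "Reykjavik": 1_200,
}

# Rank destinations by outbound volume: the top of the list is where
# imported cases are most likely to appear first.
at_risk = sorted(flights_from_epicenter, key=flights_from_epicenter.get,
                 reverse=True)
print(at_risk[:3])   # ['Bangkok', 'Hong Kong', 'Tokyo']
```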
For COVID-19, BlueDot applied this methodology to identify many of the cities among the first to receive the coronavirus, including Tokyo, Bangkok, Hong Kong, Seoul, and Taipei.
Impact
Once a virus leaves its region of origin, a wide variety of factors determine whether it will ultimately die out or grow into a full-fledged outbreak: A region may have better or worse public health infrastructure, hospitable or inhospitable climates, or varying economic resources. BlueDot's systems consider factors such as these to predict the potential impact on an identified area.
For example, if a virus is being spread by ticks, and Vancouver is in the middle of winter snow, the likelihood of an outbreak is very low because ticks would not survive that climate. However, the same virus might thrive in a humid environment like Florida, making the region at-risk for an outbreak.
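The tick example can be caricatured as a rule linking the disease's transmission vector to the destination's climate (the rules below are illustrative only, not BlueDot's model):

```python
def outbreak_risk(vector, climate):
    """Toy rules for illustration -- not BlueDot's actual impact model."""
    if vector == "tick" and climate == "snow":
        return "low"     # ticks would not survive a cold winter
    if vector == "tick" and climate == "humid":
        return "high"    # e.g. Florida's humid climate
    return "unknown"

print(outbreak_risk("tick", "snow"), outbreak_risk("tick", "humid"))   # low high
```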
Communication
If an area is determined to be at-risk, the focus shifts to providing early warnings to health officials, hospitals, airlines, and government agencies in public health, national defense, national security, and even agriculture. Kamran reiterates the importance of providing only the most relevant information to those who need it, referencing the ideas of Clay Shirky and his 2008 talk, "It's Not Information Overload. It's Filter Failure."
BlueDot first became aware of the pneumonia cases in Wuhan on December 31st, and in addition to notifying their clients and government stakeholders directly, they publicly released their findings in the Journal of Travel Medicine on January 14th.
Criticism and Limitations
These are incredibly difficult predictions to make, and the science behind the transmission of infectious diseases is complex and evolving every day. So, what is the proper role of technology? Kamran asserts that "by no means would [they] claim that AI has got this problem solved. It's just one of the tools in the toolbox."
In some cases, Kamran and his team may lack sufficient observations to develop a machine learning model for a particular disease. For this and other reasons, the company relies on a combination of approaches and a diverse team of specialists in their work.
With coronavirus already in full swing, BlueDot is looking more heavily at analyzing location data from mobile devices to provide a real-time understanding of how people are moving around. However, Kamran compares this to predicting the weather—the further ahead you're looking, the less accurate your prediction.
Despite the limitations, Kamran reinforces the value of the work by acknowledging that "Manually, it would take a hundred people around the clock [to process the data], and we have four people and a machine."
" data-search-guests="Kamran Khan">
Building Machine Learning Powered Applications: Going from Idea to Product.
Emmanuel began his career as a data scientist and went on to mentor over a hundred Ph.D. fellows looking to transition into machine learning as an AI program lead at Insight Data Science. His new book is the culmination of what he learned, and provides a guide for aspiring and practicing engineers and data scientists on how to approach ML projects systematically.
Structuring End-to-End Machine Learning Projects
In this interview, as in the book, Emmanuel shares his best practices for structuring and building projects. Emmanuel approaches new ML projects in four main stages:
- Formulating the problem and creating a plan: Here we want to think about the best possible approach to solving our specific problem. The goal is to simplify, simplify, simplify, and have a clear understanding of what your success metrics are before you start to build anything.
- Building a working pipeline and acquiring an initial dataset: Emmanuel recommends building an end-to-end data processing pipeline, albeit a simple one, right from the start, and walks us through how to test and evolve it. Like your pipeline, your dataset is also something you'll want to iterate on. Your data should inform your features and models, and not the other way around.
- Iterating on your models: Model development is inherently iterative, and Emmanuel shares his approach to developing and evaluating models. The latter depends on your ability to choose the evaluation metric most appropriate for your problem, and tools like confusion matrices, ROC curves, calibration curves, and various approaches to visualization can all come into play when trying to debug your models. Evaluating feature importance can also help here, as it allows you to check your assumptions about the problem.
- Deploying and monitoring: A number of non-technical and technical considerations come into play when unleashing your models on the real world. First off, we need to consider the ethical implications of our models as well as concerns like data ownership and bias. From a technical perspective, we need to choose a deployment option that makes sense for the way the model will be accessed by its users. We also want to build safeguards and sanity checks to protect us from model failures, and to monitor the model's predictions over time.
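The "simplest pipeline first, then iterate with an evaluation tool" idea from the stages above can be caricatured in a few lines. The toy labels and the majority-class baseline are invented for the example; they are not from the book.

```python
def majority_baseline(train_labels):
    """Stage 2: the simplest model that completes the pipeline end to end."""
    return max(set(train_labels), key=train_labels.count)

def confusion_matrix(y_true, y_pred):
    """Stage 3: one of the evaluation tools mentioned above."""
    counts = {}
    for t, p in zip(y_true, y_pred):
        counts[(t, p)] = counts.get((t, p), 0) + 1
    return counts

y_train = [0, 0, 1, 0, 1, 0]          # toy labels, invented for the example
y_test = [0, 1, 0, 1]
pred = majority_baseline(y_train)      # the majority class is 0
cm = confusion_matrix(y_test, [pred] * len(y_test))
print(cm)  # {(0, 0): 2, (1, 0): 2} -- the baseline misses every positive
```

The confusion matrix immediately exposes the baseline's weakness, which is exactly the signal you iterate against in stage three.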
The Extended Mind, by Andy Clark and David Chalmers, which suggests that the smartphone has become an extension of the human mind. But what Abeba emphasizes as most important are the disparities in how different groups of people are impacted by technology shifts, and the connection between privilege and control over that impact. AI is just the latest in a series of technological disruptions, and as Abeba notes, one with the potential to negatively impact disadvantaged groups in significant ways.
Harm of Categorization from ML Predictions
The inherent nature of so much of modern machine learning is to make predictions. An ethical approach to AI demands that we ask hard questions about those impacted by these predictions and assess the "harm of categorization." When an AI algorithm predicts that someone is more likely to be a criminal, less likely to be successful, or less qualified to receive credit, these predictions pose dangers that disproportionately impact disadvantaged populations versus those in more privileged positions.
Abeba's paper, Algorithmic Injustices Toward Relational Ethics, which recently won the best paper award at the Black in AI Workshop at NeurIPS, posits relational ethics and the relational mindset as a rethinking of those predictions. In other words, the question we should be asking is, why are certain demographics more at risk and how do we protect the welfare of those individuals most vulnerable to the social consequences of reductive labeling?
Her work also highlights that machine learning practices often rely on the assumption that the conditions they model are stable. This comes from the IID assumption, which holds that data points are independent and identically distributed. For example, you might behave a certain way at work, but at a party, you speak or act differently. This "code-switching" is natural to humans but violates ML algorithms' assumption that one's actions arise from a single distribution. For the most part, this dynamism is not something that ML sufficiently accounts for. As Abeba points out, the "nature of reality is that it is never stable... it is constantly changing." So, machine learning cannot be the final answer; it "cannot stabilize this continually moving nature of being." A relational ethics approach, however, accounts for change and assumes that solutions must be revised over time.
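A tiny numerical sketch of why the IID assumption matters: a threshold classifier fit on one pair of class distributions degrades sharply when the same classes "code-switch" to shifted distributions. All the distributions and numbers below are invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Train-time world: one feature, two classes (numbers invented for illustration).
a_train = rng.normal(0.0, 1.0, 1000)   # class A
b_train = rng.normal(3.0, 1.0, 1000)   # class B

# A deliberately simple model: threshold halfway between the class means.
threshold = (a_train.mean() + b_train.mean()) / 2

def accuracy(a, b, thr):
    """Fraction of points classified correctly by the fixed threshold."""
    return ((a < thr).sum() + (b >= thr).sum()) / (len(a) + len(b))

# IID holds: fresh samples from the same distributions.
acc_iid = accuracy(rng.normal(0.0, 1.0, 1000), rng.normal(3.0, 1.0, 1000), threshold)

# "Code-switching": both classes drift, so the learned threshold no longer fits.
acc_shifted = accuracy(rng.normal(2.0, 1.0, 1000), rng.normal(3.5, 1.0, 1000), threshold)

print(acc_iid, acc_shifted)  # the shifted accuracy is markedly lower
```

Nothing in the model changed; only the world did, which is precisely the instability Abeba describes.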
Robot Rights vs. Human Welfare
Abeba recently published another paper with her colleague Jelle van Dijk from the University of Twente, called "Robot Rights? Let's Talk about Human Welfare Instead." Like all good things, the paper came to life after a series of debates circulating on Twitter, and it comes down to two major concepts:
- Robots < humans. That is to say, robots cannot be granted or denied rights because machines are not the same as humans or any living being. The argument rests on a "philosophical post-Cartesian approach": being and knowing are sourced in the mind, which is embodied and enacted through a social environment. Robots arguably have neither conscious minds nor the embodied biological presence that constitutes existing as a "being" in the world around them.
- AI is not truly autonomous and never will be. This is because there is always a human involved to some degree. Another layer to this is the oversight of labor from "micro-workers" who contribute to AI without being acknowledged (like when you have to choose pictures of stop signs to prove you're not a bot).
here.
Crop Masking. Name that tree! This is essentially a classification task in which Gro seeks to identify what type of crop is growing in each pixel of a satellite image. The challenge is that conditions change often, and distinguishing between an orange tree and a tangerine crop might be easier said than done.
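Pixel-wise crop classification can be sketched as nearest-signature matching, which also shows why orange versus tangerine is hard: their spectral signatures can sit very close together. Gro's actual models are surely far richer; the signatures and the tiny "image" below are made up for the example.

```python
import numpy as np

# Hypothetical per-crop spectral signatures (mean reflectance in three
# satellite bands); real signatures would be learned from labeled fields.
signatures = {
    "orange":    np.array([0.12, 0.45, 0.60]),
    "tangerine": np.array([0.14, 0.43, 0.58]),  # deliberately close to orange
    "soy":       np.array([0.30, 0.55, 0.20]),
}

def classify_pixels(image):
    """Label every pixel with its nearest spectral signature.

    image: (H, W, bands) array of reflectances.
    """
    labels = list(signatures)
    centroids = np.stack([signatures[k] for k in labels])               # (C, bands)
    dists = np.linalg.norm(image[:, :, None, :] - centroids, axis=-1)   # (H, W, C)
    return np.array(labels, dtype=object)[dists.argmin(axis=-1)]

img = np.array([[[0.12, 0.45, 0.60], [0.30, 0.55, 0.20]]])  # one 1x2 "image"
crop_map = classify_pixels(img)
print(crop_map)
```

Because the orange and tangerine centroids differ by only a few hundredths, small changes in growing conditions can flip a pixel's label, which is the difficulty the paragraph describes.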
Droughts. Droughts are a major threat to farming and food production. To date, there is no standard international drought index that the world can agree on, and Gro wants to change that by analyzing environmental conditions to create an objective benchmark for severe droughts.
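No standard index exists, but one simple building block, in the spirit of the widely used Standardized Precipitation Index, is a z-score of current rainfall against its historical distribution; strongly negative values flag unusually dry conditions. The rainfall numbers below are hypothetical, and this is not Gro's benchmark.

```python
def drought_zscore(history, current):
    """Z-score of current rainfall against its historical distribution.

    Strongly negative values indicate unusually dry conditions; this is a
    crude building block, not a full drought index.
    """
    n = len(history)
    mean = sum(history) / n
    std = (sum((x - mean) ** 2 for x in history) / n) ** 0.5
    return (current - mean) / std

rain_mm = [80, 95, 70, 110, 90, 85, 100, 75]   # hypothetical monthly history
z = drought_zscore(rain_mm, 30)
print(round(z, 2))  # strongly negative: a severe-drought signal
```

An objective benchmark would need to combine signals like this across rainfall, soil moisture, and vegetation health, which is where the machine learning comes in.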
Knowledge Graph Automation. Gro ingests data from dozens of sources, and that information needs to be organized into a common, structured ontology, or knowledge graph. Gro uses machine learning models to automate this task, extracting data and updating how it flows into the knowledge graph.
The Data is So Good
Gro's models ingest "wildly different data types" to support the company's models and allow them to get a sense of a dynamic agriculture market. The majority, at least in volume, comes from satellite data, spanning the entire frequency range of the electromagnetic spectrum, including visible, ultraviolet, and infrared. This helps Gro deduce a wealth of information about crop growth and growing conditions around the globe.
In addition to satellite imagery, the company also collects a huge amount of time series data, much of it originating in PDFs or, worse, scanned paper reports issued by local governments.
The company's database currently holds over 55 million data series, and that number is doubling every 6-9 months. Reproducibility and attribution are extremely important, so the company ensures that each data point can be traced back to its source.
Despite the sheer number of data sources, the available data is not always sufficient. That's where Gro's own derived data series come into play: the company applies its machine learning models to data from multiple sources to create new, insightful data series, which also helps users overcome inconsistencies found in any individual source.
For the most part, the data Gro collects is surprisingly clean. As Nemo notes, it's "hard to lie to a satellite." Try me.
Modeling Lessons Learned
To deal with their scale, Gro has had to learn many lessons about developing effective machine learning models in agriculture. The keys to their success, according to Nemo, lie in:
- Choosing what to model. Gro carefully screens candidate problems, asking whether each is important and economically interesting for its user base.
- Don't come at a problem with a solution. This involves remaining "agnostic to technology" and being prepared to try different approaches to each issue.
- Build for the masses. The company actively builds general frameworks that can be applied to different situations and geographic regions.
- Pause, then go. Before launching a set of models, the team evaluates performance in distinctive ways, such as examining how the error is distributed spatially and temporally, and brings in domain expertise to guide feature engineering and model tweaks.
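The "spatially distributed error" check in the list above can be sketched as a per-region error breakdown. The `(region, y_true, y_pred)` schema and the yield numbers are invented for the example, not Gro's data.

```python
from collections import defaultdict

def error_by_region(records):
    """Mean absolute error per region.

    records: iterable of (region, y_true, y_pred) tuples, a hypothetical
    schema for the spatial error breakdown described above.
    """
    totals = defaultdict(lambda: [0.0, 0])
    for region, y_true, y_pred in records:
        totals[region][0] += abs(y_true - y_pred)
        totals[region][1] += 1
    return {r: s / n for r, (s, n) in totals.items()}

preds = [
    ("Iowa", 100, 98), ("Iowa", 120, 121),
    ("Kansas", 80, 95), ("Kansas", 90, 70),
]
errs = error_by_region(preds)
print(errs)  # {'Iowa': 1.5, 'Kansas': 17.5}
```

A model with a good overall score but a Kansas-sized error cluster is exactly what this kind of pre-launch check is meant to catch.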
differential privacy, a topic we've covered here on the show quite extensively over the years. Differential privacy is a system for publicly sharing information about a dataset by describing patterns of groups within the dataset; the catch is that you have to do this without revealing information about the individuals in it.
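The canonical way to achieve this for a numeric query is the Laplace mechanism: add noise calibrated to the query's sensitivity. Here is a minimal sketch for a counting query (an illustration, not LinkedIn's production code).

```python
import random

def private_count(true_count, epsilon):
    """Release a count with the Laplace mechanism.

    A counting query has sensitivity 1 (adding or removing one person
    changes the count by at most 1), so Laplace(1/epsilon) noise gives
    epsilon-differential privacy for this single release.
    """
    scale = 1.0 / epsilon
    # A Laplace sample is the difference of two iid exponential samples.
    noise = random.expovariate(1.0 / scale) - random.expovariate(1.0 / scale)
    return true_count + noise

random.seed(0)
noisy = private_count(1000, epsilon=0.5)
print(noisy)  # close to 1000, but any one individual's presence is masked
```

Smaller epsilon means more noise and stronger privacy; the aggregate pattern survives while individual contributions are hidden.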
Ryan currently applies differential privacy at LinkedIn, but he has worked in the field, and on the related topic of federated learning, for quite some time. He was introduced to the subject as a PhD student at the University of Pennsylvania, where he worked closely with Aaron Roth, who we had the pleasure of interviewing back in 2018.
Ryan later worked at Apple, where he focused on the local model of differential privacy, meaning differential privacy is performed on individual users' local devices before being collected for analysis. (Apple uses this, for example, to better understand our favorite emojis 🤯 👍👏).
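The textbook primitive for the local model is randomized response: each user perturbs their own answer before it ever leaves the device, yet aggregate rates remain estimable. Apple's deployed mechanisms are more sophisticated; this is just the classic illustration of the idea.

```python
import random

def randomized_response(truth: bool) -> bool:
    """Flip a coin: report the truth half the time, a random bit otherwise.

    Any single report is plausibly deniable, but P(report=True) equals
    0.25 + 0.5 * true_rate, so the population rate can be recovered.
    """
    if random.random() < 0.5:
        return truth
    return random.random() < 0.5

def estimate_rate(reports):
    """Invert the bias: true_rate = (observed_rate - 0.25) / 0.5."""
    return (sum(reports) / len(reports) - 0.25) / 0.5

random.seed(1)
true_answers = [i < 300 for i in range(1000)]        # true rate: 30%
reports = [randomized_response(t) for t in true_answers]
print(round(estimate_rate(reports), 2))  # close to the true 0.3 rate
```

No server ever sees a trustworthy individual answer, which is the defining property of the local model.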
Not surprisingly, they do things a bit differently at LinkedIn. They utilize a central model, where the user's actual data is stored in a central database, with differential privacy applied before the data is made available for analysis.
(Another interesting use case that Ryan mentioned in the interview: the U.S. Census Bureau has announced plans to publish 2020 census data using differential privacy.)
Ryan recently put together a research paper with his LinkedIn colleague, David Durfee, that they presented as a spotlight talk at NeurIPS in Vancouver. The title of the paper is a bit daunting, but we break it down in the interview. You can check out the paper here: Practical Differentially Private Top-k Selection with Pay-what-you-get Composition.
There are two major components to the paper. First, they wanted to offer practical algorithms that you can layer on top of existing systems to achieve differential privacy for a very common type of query: the "Top-k" query, which means helping answer questions like "what are the top 10 articles that members are engaging with across LinkedIn?" Secondly, because privacy is reduced when users are allowed to make multiple queries of a differentially private system, Ryan's team developed an innovative way to ensure that their systems accurately account for the information the system returns to users over the course of a session. It's called Pay-what-you-get Composition.
[caption id="attachment_7063" align="aligncenter" width="600"]
This is a picture that Sam drew to show what's happening here.
[/caption]
One of the big innovations of the paper is discovering the connection between a common algorithm for implementing differential privacy, the exponential mechanism, and Gumbel noise, which is commonly used in machine learning.
Thanks to LinkedIn for sponsoring today's show! LinkedIn Engineering solves complex problems at scale to create economic opportunity for every member of the global workforce. AI and ML are integral aspects of almost every product the company builds for its members and customers. LinkedIn's highly structured dataset gives their data scientists and researchers the ability to conduct applied research to improve member experiences. To learn more about the work of LinkedIn Engineering, please visit engineering.linkedin.com/blog.
" data-search-guests="Ryan Rogers">
One of the really nice connections that we made in our paper was that actually the exponential mechanism can be implemented by adding something called Gumbel noise, rather than Laplace noise. Gumbel noise actually pops up in machine learning. It's something that you would do to report the category that has the highest weight, [using what is] called the Gumbel Max Noise Trick. It turned out that we could use that with the exponential mechanism to get a differentially private algorithm. [...] Typically, to solve top-k, you would use the exponential mechanism k different times; you can now do this in one shot by just adding Gumbel noise to [existing algorithms] and report the k values that are in the top [...] which made it a lot more efficient and practical.

When asked what he was most excited about for the future of differential privacy, Ryan cited the progress in open-source projects.
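That one-shot idea can be sketched directly: add Gumbel noise to each count and report the k largest. The `2 * k / epsilon` noise scale and the toy view counts below are illustrative assumptions; the paper handles sensitivity, unknown domains, and the pay-what-you-get composition accounting with far more care.

```python
import math
import random

def gumbel_top_k(counts, k, epsilon):
    """One-shot differentially private top-k via Gumbel noise.

    Adds Gumbel noise to every count and reports the k largest: the
    connection described above between the exponential mechanism and
    the Gumbel-max trick. The 2*k/epsilon scale is a rough peeling-style
    calibration; see the paper for the exact accounting.
    """
    scale = 2 * k / epsilon
    noisy = {
        item: c - scale * math.log(-math.log(random.random()))  # c + Gumbel
        for item, c in counts.items()
    }
    return sorted(noisy, key=noisy.get, reverse=True)[:k]

random.seed(0)
article_views = {"a": 5000, "b": 4800, "c": 120, "d": 90, "e": 60}
top = gumbel_top_k(article_views, k=2, epsilon=1.0)
print(top)  # ['a', 'b']
```

One pass over the counts replaces k sequential invocations of the exponential mechanism, which is where the efficiency gain comes from.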
This is the future of private data analytics. It's really important to be transparent with how you're doing things, otherwise if you're just touting that you're private and you're not revealing what it is, then is it really private?

He pointed out the open-source collaboration between Microsoft and Harvard's Institute for Quantitative Social Sciences. The project aims to create an open-source platform that allows researchers to share datasets containing personal information while preserving the privacy of individuals. Ryan expects such efforts to bring more people to the field, encouraging applications of differential privacy that work in practice and at scale. Listen to the interview with Ryan to get the full scope! And if you want to go deeper into differential privacy, check out our series of interviews on the topic from 2018.
@samcharrington or @twimlai.
To follow along with the 2019 AI Rewind Series, head over to the series page!
" data-search-guests="Nasrin Mostafazadeh">
@samcharrington or @twimlai!
" data-search-guests="Amir Zamir">
" data-search-guests="Timnit Gebru">
" data-search-guests="Chelsea Finn">
" data-search-guests="Zachary Lipton">
267. In our conversation with Tijmen, we discuss the ins and outs of compression and quantization of ML models, including how much models can actually be compressed and the best ways to achieve it. We also look at the recent "Lottery Ticket Hypothesis" paper and how it factors into this research, as well as best practices for training efficient networks. Finally, Tijmen recommends a few techniques for those interested, including tensor factorization and channel pruning.
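As a flavor of what quantization does, here is a generic uniform int8 round trip (a sketch, not Qualcomm's specific method): weights are mapped to 8-bit integers with one scale factor, and the reconstruction error is bounded by half a quantization step.

```python
import numpy as np

def quantize_int8(weights):
    """Uniform symmetric int8 quantization with a single scale factor."""
    scale = np.abs(weights).max() / 127.0
    q = np.clip(np.round(weights / scale), -127, 127).astype(np.int8)
    return q, scale

rng = np.random.default_rng(0)
w = rng.normal(0.0, 0.1, size=(4, 4)).astype(np.float32)  # toy weight tensor

q, scale = quantize_int8(w)
w_hat = q.astype(np.float32) * scale          # dequantize
err = np.abs(w - w_hat).max()
print(err <= scale / 2 + 1e-7)  # True: error bounded by half a quantization step
```

The int8 tensor needs a quarter of the storage of float32, which is the basic trade quantization makes between model size and reconstruction error.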
" data-search-guests="Tijmen Blankevoort">
Max Welling, Qualcomm has a hand in tons of machine learning research and hardware, and our conversation with Jeff is no different. We discuss how the various training frameworks fit into the developer experience when working with their chipsets, examples of federated learning in the wild, the role inference will play in data center devices and more.
" data-search-guests="Jeff Gehlhaar">
TWIMLcon conference, which will focus on the tools, technologies, and practices necessary to scale the delivery of machine learning and AI in the enterprise. The event will be held October 1st & 2nd in San Francisco and early bird registration is open today at twimlcon.com.
" data-search-guests="Yunfan Gerry Zhang">
" data-search-guests="Laurence Watson">
" data-search-guests="William Fehlman">
" data-search-guests="Judy Gichoya">
" data-search-guests="Karen Levy">
TWIML Talk #184, with Viviana Acquaviva, where we explore dark energy and star formation, and if you want to go way back, TWIML Talk #5 with Joshua Bloom which provides a great overview of the application of ML in astronomy.
" data-search-guests="Yashar Hezaveh">
@samcharrington or leave a comment below with your thoughts.
" data-search-guests="Rob Walker">
" data-search-guests="Lucas Joppa">
" data-search-guests="Justin Spelhaug">