Paul Rigor, PhD
Richland, Washington, United States
3K followers
500+ connections
Yann LeCun
Advanced Machine Intelligence… • 1M followers
An excellent piece in Newsweek by former Bell Labs President Marcus Weldon on "the 8 principles for the future of AI." In a remarkably clear and concise manner, the piece distills some basic predictions from a series of interviews with roboticist Rodney Brooks, neuroscientist David Eagleman, and myself. https://lnkd.in/dsJSSvGM
Andreas Maier
Friedrich-Alexander-Universitä… • 7K followers
AI on Review: How Large Language Models Are Reshaping Peer Review

The Peer Review Crunch and New "Reviewer Duties"

Peer review is the backbone of scientific quality control, ensuring that research findings are vetted for accuracy and significance before publication (pmc.ncbi.nlm.nih.gov). In fast-moving fields like machine learning and computer vision, top conferences function much like journals, and the integrity of science depends on rigorous peer evaluation.

However, the system is straining under an avalanche of submissions. Major AI conferences now routinely receive well over 10,000 papers, a surge that has stretched the reviewer pool to its limits (arxiv.org). This deluge has led to radical policy changes: some conferences now essentially conscript all submitting authors into service as reviewers. For example, ICLR 2025 explicitly warned authors that any paper without at least one author signed up to review would be desk-rejected (reddit.com). NeurIPS and others have similarly pleaded that "all authors help with reviewing, if asked," to tackle the reviewer shortage.

https://lnkd.in/d7KDP99K
Elle O'Brien
Michigan Institute for Data… • 4K followers
Announcing a new pre-print, "More code, less validation: Risk factors for over-reliance on AI coding tools among scientists," with Alexis Parker, Nasir Eisty, and Jeffrey Carver. In a survey of >800 scientists across domains and career stages who program, we find:

- ChatGPT is king for programming support; only a minority of scientific coders use dedicated developer tools. In general, adoption is highest among student researchers and more junior programmers.
- Perceived productivity gains are highest for scientists who (a) are less experienced programmers, (b) make relatively low use of software development practices like version control, code review, and testing, and (c) report accepting more lines of generated code in their interactions with AI tools.
- Non-adopters cite concerns about preventing their own learning, as well as ethical objections.

Now my editorializing: it is concerning to me that the scientists who likely have the least support for getting critical feedback on their code (whether through automated tests or review by a lab-mate) and the least experience are the most likely to rate themselves as highly productive with AI tools. Some of these reported productivity gains could be very real, and at the same time, it could be getting easier than ever to publish scientific code with subtle but serious semantic errors, because of a lack of opportunities for feedback beyond runtime errors (which LLM-based tools are pretty good at resolving, anyway).

I also wonder if we should be encouraging scientists who currently program with ChatGPT to try tools that are integrated into developer environments, which might blunt some of the most serious sycophancy-type issues associated with chat and better facilitate "good enough" development practices. Developer tools that use sub-agents also relieve some of the meta-cognitive burden of tracking context rot, which I suspect is a lot for junior coders.

What do you think? Link: https://lnkd.in/gqDVFKMB
Colby T. Ford
Tuple, The Cloud Genomics… • 11K followers
Working on a super-secret protein project, but you don't want to share your input sequences with a public multiple sequence alignment (MSA) service? In today's post, I show how to host your own MSA API endpoint (like a private version of api.colabfold.com) in Azure AI Foundry. Then you can fold in private using models like Boltz-2 or other folding tools that require MSAs. Give it a read: https://lnkd.in/eCuMaWdf #proteinfolding #msa #sequencealignment #boltz #alphafold #bioinformatics #genomics #computationalbiology #azure #ai #drugdesign #azureaifoundry
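The post links out for the actual Azure setup; purely as a toy illustration of the "keep sequences on infrastructure you control" idea, here is a minimal sketch of packaging sequences for a private MSA endpoint. The endpoint URL and payload shape below are invented for the example and are not the real api.colabfold.com or Azure AI Foundry contract:

```python
def build_msa_request(sequences, endpoint="https://my-msa-service.example.com/msa"):
    """Assemble a request for a hypothetical self-hosted MSA endpoint.
    Both the URL and the payload shape are illustrative placeholders,
    not the actual ColabFold or Azure AI Foundry API."""
    fasta = "\n".join(f">seq{i}\n{seq}" for i, seq in enumerate(sequences))
    return {"url": endpoint, "json": {"fasta": fasta}}

# The returned dict could then be sent with, e.g.,
# requests.post(req["url"], json=req["json"]) against your own service.
req = build_msa_request(["MKVLA", "GGGS"])
```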
Syed R Ali
KAUST • 13K followers
I have seen outstanding postdoc candidates miss opportunities — not because they lacked skill, but because information was scattered across emails, PDFs, and portals. When applicants understand expectations early, decisions become faster and fairer for everyone involved. Clarity isn’t optional in hiring. It’s foundational.
Mehdi Jenab
OTI Lumionics • 1K followers
For many data scientists and engineers, Quantum Machine Learning has felt like a fortress guarded by complex physics, but a new publication has effectively lowered the drawbridge. A recently published 260-page paper titled "Quantum Machine Learning: A Hands-on Tutorial" is changing that dynamic by translating quantum concepts directly into the language of computer science and linear algebra. You can access the full paper here: https://lnkd.in/dhX_Me9m

What makes this resource particularly valuable is that it does not ask you to become a physicist to understand the material. Instead of focusing on the behavior of particles, it meets machine learning practitioners where they are by systematically rebuilding familiar concepts, like kernels, neural networks, and transformers, using quantum circuits. It frames the quantum processor as a new type of computational tool designed to solve classical data problems, rather than just a physics experiment.

The tutorial prioritizes practical application over abstract theory. It moves from the very basics of qubits to advanced modern architectures, providing code implementations to ensure the concepts are actionable. If you are looking to expand your toolkit and understand how quantum hardware might eventually optimize the AI models we use today, this is the foundational guide you have been waiting for.
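In the spirit of the tutorial's linear-algebra framing (this sketch is mine, not taken from the paper), a single-qubit circuit really is just a matrix-vector product over complex amplitudes:

```python
import math

# A qubit state is a length-2 complex vector; gates are 2x2 unitary matrices.
# |0> = [1, 0]; the Hadamard gate rotates it into an equal superposition.
H = [[1 / math.sqrt(2), 1 / math.sqrt(2)],
     [1 / math.sqrt(2), -1 / math.sqrt(2)]]

def apply_gate(gate, state):
    """Applying a gate is an ordinary matrix-vector product."""
    return [sum(gate[i][j] * state[j] for j in range(2)) for i in range(2)]

state = apply_gate(H, [1.0, 0.0])        # H|0>
probs = [abs(a) ** 2 for a in state]     # Born rule: probabilities ~ [0.5, 0.5]
```

No physics required: the "circuit" is composition of small unitaries, which is exactly the framing that makes quantum kernels and quantum neural networks accessible to ML practitioners.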
Fan Li
Apex 974 • 9K followers
New research reveals that throwing more compute at molecular LLMs (Mol-LLMs) might not solve drug discovery’s real problem. In a new publication, researchers introduced #NovoMolGen, a family of transformer-based Mol-LLMs trained on 1.5 billion molecules from ZINC-22. This work represents one of the largest systematic studies to date: 30K+ experiments varying model sizes (32M–300M), molecular representations (SMILES, SELFIES, SAFE, DeepSMILES), and tokenization strategies (atomwise vs. BPE).

So which factors truly drive performance in generating novel drug-like molecules, and which don’t?

🔹 Early saturation: Unlike LLMs, where scale usually helps, Mol-LLMs plateau early. Even the smallest model (32M parameters) outperformed strong baselines, and increasing model size yielded only marginal gains.
🔹 Data quality: ZINC-22’s 37B+ molecules are optimized for synthesizability, not biological function. As a result, models may have learned chemical syntax while missing the deeper semantics of molecular activity.
🔹 Representation matters less: SMILES, SELFIES, SAFE, and DeepSMILES all performed similarly. Encoding choices help at the margins, but don’t solve the core challenge.

What are the implications?
🔹 Does early saturation suggest that, unlike biological sequences, there’s simply less to learn from chemical structures?
🔹 Could representations like GNNs capture relationships that SMILES and SELFIES miss?
🔹 And might hybrid approaches, combining ZINC-22 with smaller but biologically richer datasets, unlock more functional models?

I’d love to hear your thoughts.

📄 NovoMolGen: Rethinking Molecular Language Model Pretraining, arXiv, August 19, 2025
🔗 https://lnkd.in/ezp8HYAN
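For readers curious what "atomwise" tokenization means in practice, here is a small sketch using a regex in the style commonly used for SMILES tokenization in the molecular-transformer literature; the pattern and function names are my illustration, not NovoMolGen's actual tokenizer:

```python
import re

# Atomwise SMILES tokens: bracket atoms ([NH+], [C@@H], ...), two-letter
# elements (Cl, Br), aromatic atoms, bonds, branches, and ring-bond digits.
SMILES_TOKEN = re.compile(
    r"(\[[^\]]+]|Br?|Cl?|N|O|S|P|F|I|b|c|n|o|s|p"
    r"|\(|\)|\.|=|#|-|\+|\\|/|:|~|@|\?|>|\*|\$|%[0-9]{2}|[0-9])"
)

def tokenize_atomwise(smiles):
    """Split a SMILES string into atom-level tokens, in contrast to BPE,
    which would learn merged multi-atom subwords from corpus statistics."""
    return SMILES_TOKEN.findall(smiles)

# Aspirin: every atom, bond, branch, and ring digit becomes its own token.
tokens = tokenize_atomwise("CC(=O)Oc1ccccc1C(=O)O")
```

The contrast with BPE is that the vocabulary here is fixed and chemically interpretable, whereas BPE merges frequent fragments into single tokens.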
Evan Peikon
NNOXX Inc. • 8K followers
Co-expression network analysis is a powerful approach for identifying groups of genes, referred to as "modules," that exhibit coordinated expression patterns across a dataset. These modules often represent functionally related genes, or sets of genes, that work together in shared biological pathways or cellular processes. By capturing patterns of gene activity, co-expression networks enable researchers to infer relationships between gene expression and biological traits or conditions. This makes them invaluable tools for uncovering the mechanisms driving disease progression, developmental processes, or responses to environmental changes.

One of the most widely adopted tools for constructing co-expression networks is Weighted Gene Co-expression Network Analysis (WGCNA). This method leverages pairwise correlations between gene expression levels to group genes into clusters, or modules, of tightly interconnected genes. Additionally, WGCNA highlights key genes within each module, known as hub genes, that are highly connected and often play pivotal regulatory roles.

The latest article on Decoding Biology will introduce you to the principles of co-expression network analysis, walk you through constructing a co-expression network, and demonstrate how to identify key modules and hub genes associated with specific biological traits.

PS - You can also read this article on Github with the following link: https://lnkd.in/eCJY9TZ2

#compbio #bioinformatics #genomics #biotech #systemsbiology #datascience
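As a minimal sketch of the core WGCNA computation (not the full package, which adds topological overlap, dynamic tree cutting, and module detection), the soft-thresholded correlation adjacency and per-gene connectivity can be written in a few lines of numpy:

```python
import numpy as np

def coexpression_adjacency(expr, beta=6):
    """WGCNA-style weighted adjacency: absolute Pearson correlation between
    gene expression profiles, raised to a soft-thresholding power `beta`
    so weak correlations shrink toward zero."""
    adj = np.abs(np.corrcoef(expr)) ** beta   # genes x genes
    np.fill_diagonal(adj, 0.0)                # no self-connections
    return adj

def connectivity(adj):
    """Per-gene connectivity; hub genes are the most connected rows."""
    return adj.sum(axis=1)

# Toy data: rows are genes, columns are samples.
expr = np.array([
    [1.0, 2.0, 3.0, 4.0, 5.0],    # gene A
    [2.0, 4.0, 6.0, 8.0, 10.0],   # gene B, perfectly correlated with A
    [5.0, 1.0, 4.0, 2.0, 3.0],    # gene C, largely unrelated
])
adj = coexpression_adjacency(expr)
```

Raising |correlation| to a power rather than hard-thresholding it is what makes the network "weighted": borderline correlations are down-weighted smoothly instead of being cut at an arbitrary threshold.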
James Brand
Microsoft • 2K followers
Wrote a second blog post! This time it's about using LLMs for imputing missing data. I've seen a few papers about imputation with LLMs, but most seem to either train custom imputation transformers or horse-race flagship models against standard methods. Mert Demirer and I recently played with an alternative idea that instead uses LLMs as part of an ensemble imputation approach, relying on their "knowledge" of the world to provide additional prediction signal. I haven't seen a paper about it yet, so we thought it'd be fun to make a short post: https://lnkd.in/gMzC6iaP
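As a hedged illustration of the ensemble idea (a toy sketch, not the method from the post), one could blend an LLM's guess with a standard statistical imputation such as the column mean; `llm_predict` below is a hypothetical stand-in for an actual model call:

```python
def ensemble_impute(rows, col, llm_predict, weight=0.5):
    """Fill missing values in `col` by blending a baseline imputation
    (mean of observed values) with an LLM's prediction for each row.
    `llm_predict(row)` is a hypothetical stand-in for a model call that
    returns a numeric guess from the row's other fields."""
    observed = [r[col] for r in rows if r[col] is not None]
    col_mean = sum(observed) / len(observed)
    for r in rows:
        if r[col] is None:
            r[col] = weight * llm_predict(r) + (1 - weight) * col_mean
    return rows

# Demo with a stubbed "LLM" that always guesses 90 for the missing field;
# the observed mean is 60, so the blended imputation lands at 75.
rows = [{"income": 50.0}, {"income": None}, {"income": 70.0}]
filled = ensemble_impute(rows, "income", llm_predict=lambda row: 90.0)
```

In a real ensemble, `weight` could itself be tuned on held-out complete rows, letting the data decide how much to trust the LLM's world knowledge versus the statistical baseline.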
Pavan Ramkumar
Various • 2K followers
I've been building SLAF, a ground-up rethink of storage and compute for single-cell datasets in the virtual cell era [1]. As I've written before, SLAF data loaders are ~100x faster than status-quo h5ad-based data loaders [2].

While I've written about scale, throughput, and streaming architecture before, I've glossed over an important piece: GPUs need to see randomized batches, and randomization needs to happen on the fly. In version 0.3.0, we introduce a novel approach called Mixture of Scanners to address this challenge. Mixture of Scanners achieves ~90% of theoretical maximum entropy without sacrificing throughput.

What does this have to do with Ignatius J. Reilly from A Confederacy of Dunces at the Library of Congress? New blog post: https://lnkd.in/geAKUXND

Try it: pip install "slafdb[ml]>=0.3.0"
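SLAF's actual implementation isn't shown in the post, but the core idea as I read it can be sketched in a few lines: keep several sequential scanners open over disjoint shards and draw each batch element from a randomly chosen scanner, so disk reads stay sequential while batches come out interleaved. The shard layout and names below are invented for illustration, not SLAF's API:

```python
import random

def mixture_of_scanners(shards, batch_size, seed=0):
    """Approximate on-the-fly shuffling for streaming data: each shard gets
    a sequential scanner, and every batch element comes from a randomly
    chosen scanner. Reads stay sequential within a shard (fast), while
    interleaving across scanners injects randomness (high batch entropy)."""
    rng = random.Random(seed)
    scanners = [iter(shard) for shard in shards]
    batch = []
    while scanners:
        i = rng.randrange(len(scanners))      # pick a scanner at random
        try:
            batch.append(next(scanners[i]))
        except StopIteration:
            scanners.pop(i)                   # this shard is exhausted
            continue
        if len(batch) == batch_size:
            yield batch
            batch = []
    if batch:
        yield batch
```

The trade-off is visible in the sketch: items within a shard still arrive in order (no random seeks), so entropy is bounded below a true global shuffle, which matches the "~90% of theoretical maximum entropy" framing.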