Introduction
You can feel the field turning. For years, 3D protein structure prediction looked like magic that lived behind glass. Then an open door appeared. OpenFold3 takes ideas that changed biology and puts them in your hands, with an Apache 2.0 license, practical tooling, and a path from first demo to production. If you work anywhere near biomolecular modeling or computational drug discovery, this is the moment to get curious and ship your first result.
1. Why This Matters Now
Science moves fastest when the best tools are not gated. OpenFold3 lowers the cost of experimentation, and it does it in two ways. First, the model is fully open, which means you can study it, adapt it, and use it in industry. Second, the deployment path is simple. You can run a prediction in a browser on NVIDIA NIM, then scale the same workflow into your own cluster. That combination, open code plus simple deployment, is what makes this an AlphaFold3 alternative that you can actually use.
This openness shifts timelines. Teams in pharma, biotech, and academia can run structure predictions for proteins, nucleic acids, and drug-like ligands without negotiating access. That helps more ideas survive first contact with data. It also lets small groups test risky hunches in days. The result is more shots on goal, and better ones.
2. What Is OpenFold3? The Open Source Foundation Model For Biology
OpenFold3 is a foundation model for biomolecular structure prediction. Give it a set of sequences (protein, DNA, or RNA) plus optional ligands, and it predicts the 3D structure of the full complex. The architecture builds on diffusion-style inference with attention over multiple modalities. It handles monomers, homomers, multimers, nucleic-acid hybrids, and protein-ligand assemblies. Out of the box, it reads multiple sequence alignments (MSAs) if you have them, and it can also run in MSA-light or no-MSA modes when needed.
Because it is a foundation model, you can fine-tune it for the corner of biology you care about. That might be enzyme design, RNA targeting, or better ranking inside a protein structure prediction software pipeline. In each case the point is the same. Start with a strong prior, then specialize it with your data.
3. Main Features At A Glance
The table below summarizes what you get on day one and how it maps to common lab needs.
OpenFold3 Capabilities And Status
| Capability | What It Means In Practice | Status | Where It Helps |
|---|---|---|---|
| Protein structures (monomers and complexes) | Predict structures for single proteins, homomers, and heteromeric assemblies | Mature | Core protein science, stability and interface analysis |
| DNA and RNA | Include nucleic acid chains alongside proteins | DNA: Mature; RNA: Strong | RNP biology, regulatory complexes |
| Small molecule ligands | Co-fold protein-ligand complexes from SMILES or CCD codes | Preview | Early binding hypotheses, pocket shaping |
| Templates and MSAs | Use ColabFold or precomputed alignments, with template support | Mature | Higher accuracy when evolutionary signal exists |
| Confidence metrics | pLDDT, PAE, ipTM, ranking by model | Mature | Triage results, compare designs |
| Scalable inference | NVIDIA NIM containers and local CLI | Mature | From laptop tests to cluster-scale screens |
| Open license | Apache 2.0 for code and training recipes | Mature | Commercial R&D, reproducible science |
OpenFold3 delivers 3D protein structure prediction that respects the real world. You can start simple, and you can push it hard.
4. Your First Prediction In The NVIDIA NIM Playground

You do not need a workstation to get a feel for OpenFold3. The fastest path is the hosted NVIDIA NIM experience.
- Go to build.nvidia.com/openfold/openfold3.
- Pick “Protein-DNA Complex,” “Protein-Ligand Complex,” or start from a blank slate.
- Paste your sequences. For a protein-DNA test, use a protein chain in the PROTEIN box, then two short DNA chains. You can also attach MSA files if you have them.
- Click Run. The model runs on accelerated infrastructure.
- Inspect the Preview. You will see a 3D structure and a prediction score like pLDDT.
- Download results in mmCIF for your modeling stack, or convert them to PDB downstream if your tools need it.
That is all you need for a first pass on protein structure prediction. From here you can move into examples that speak to real decisions in drug discovery.
4.1 Protein-DNA Complex Example
Let the interface guide the setup. Select “Protein-DNA Complex.” Paste a protein sequence into the protein field. Add two DNA chains, one per field. If you have an MSA for the protein, add it as a main or paired alignment. Run the job. In a minute you will see a co-folded complex. Rotate the model. Check the grooves, the contacts, and the average pLDDT. Export the mmCIF and bring it into your analysis notebook.
4.2 Protein-Ligand Complex Example
Drug work lives and dies on binding. Select “Protein-Ligand Complex.” Add a protein sequence. For the ligand, use a CCD code like ATP or a SMILES string. Run the prediction. Inspect the pocket. Look for agreement between the predicted pose and your prior knowledge. If you are exploring analogs, rerun with variations, then sort the outputs by ipTM and pocket geometry. This is not a docking engine. It is a generative structural prior, and a powerful way to prune the search space.
5. Interpreting Outputs With Confidence

A great picture can still mislead. With OpenFold3, you can judge quality with multiple signals.
- pLDDT estimates per-residue confidence. Higher is better. Average it for a quick sanity check.
- PAE (predicted aligned error) estimates how far off the relative position of each residue pair is likely to be. Lower is better. Use low PAE between chains as a proxy for a stable interface.
- ipTM and pTM summarize interface confidence and overall fold. When you run multiple diffusion samples, sort by these scores, then inspect the top few.
- Clash flags and disorder hints help you filter junk fast.
Good habits make this reliable. Always save JSON confidence files along with the structures. Plot pLDDT across residues to spot weak regions. Compare samples rather than trusting a single view. When the metrics fight the picture, trust the metrics first, then dig.
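Those habits can be sketched in a few lines of Python. This is a minimal sketch on toy data, not OpenFold3's own tooling: the `plddt`, `pae`, and `chain_ids` keys are assumed names for illustration, so check the confidence JSON your run actually writes and adjust them.

```python
from statistics import mean


def weak_regions(plddt, threshold=70.0):
    """Return inclusive (start, end) index ranges where pLDDT is below threshold."""
    regions, start = [], None
    for i, score in enumerate(plddt):
        if score < threshold and start is None:
            start = i
        elif score >= threshold and start is not None:
            regions.append((start, i - 1))
            start = None
    if start is not None:
        regions.append((start, len(plddt) - 1))
    return regions


def interface_pae(pae, chain_ids, chain_a, chain_b):
    """Mean PAE over residue pairs that span chain_a and chain_b, in both directions."""
    values = [
        pae[i][j]
        for i, ci in enumerate(chain_ids)
        for j, cj in enumerate(chain_ids)
        if {ci, cj} == {chain_a, chain_b}
    ]
    return mean(values)


# Toy confidence record; the key names are hypothetical.
chain_ids = ["A", "A", "A", "B", "B", "B"]
confidence = {
    "plddt": [92.0, 88.0, 55.0, 48.0, 81.0, 90.0],
    "chain_ids": chain_ids,
    # Low error within a chain, higher error across chains.
    "pae": [[1.0 if a == b else 8.0 for b in chain_ids] for a in chain_ids],
}

mean_plddt = mean(confidence["plddt"])
weak = weak_regions(confidence["plddt"])
iface = interface_pae(confidence["pae"], confidence["chain_ids"], "A", "B")
print(f"mean pLDDT={mean_plddt:.1f}, weak regions={weak}, A-B interface PAE={iface:.1f}")
```

The 70.0 threshold is an arbitrary starting point for the sketch. Pick a cutoff that matches how your team already triages models.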
6. Local Install And Power User Workflows
Browser demos are great. Real work needs reproducibility and control. You can install the package on a Linux box with an Ampere or Hopper GPU, then script the full workflow. Here is a minimal path with OpenFold3.
```bash
# 1) Install
pip install openfold3
mamba install kalign2 -c bioconda

# 2) One-time setup, downloads model parameters
setup_openfold

# 3) Run a single prediction with ColabFold MSA
run_openfold predict \
  --query_json examples/example_inference_inputs/query_ubiquitin.json \
  --output_dir out/
```
Queries live in a simple JSON. You define chains, types, and sequences. You can point to precomputed MSAs when you have them, or ask the pipeline to fetch alignments from ColabFold for protein chains. Outputs include mmCIF or PDB files, confidence JSON, and timing.
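To make the idea of a query file concrete, here is a hedged sketch that builds one in Python. The field names (`queries`, `chains`, `molecule_type`, `chain_ids`, `sequence`, `ccd_codes`) are assumptions made for illustration and are not confirmed by this article, so compare them against the bundled `query_ubiquitin.json` before relying on them. The helper functions are hypothetical.

```python
import json


def protein_chain(chain_id, sequence):
    """One protein chain entry (field names are assumed, not verified)."""
    return {"molecule_type": "protein", "chain_ids": [chain_id], "sequence": sequence}


def ligand_chain(chain_id, ccd_code):
    """One ligand entry referenced by CCD code (field names are assumed)."""
    return {"molecule_type": "ligand", "chain_ids": [chain_id], "ccd_codes": ccd_code}


def make_query(name, chains):
    """Wrap chain entries in a single named query."""
    return {"queries": {name: {"chains": chains}}}


# Human ubiquitin, 76 residues.
UBIQUITIN = (
    "MQIFVKTLTGKTITLEVEPSDTIENVKAKIQDKEGIPPDQQRLIFAGKQLEDGRTLSDYNIQKESTLHLVLRLRGG"
)

query = make_query(
    "ubiquitin_atp",
    [protein_chain("A", UBIQUITIN), ligand_chain("B", "ATP")],
)
query_text = json.dumps(query, indent=2)
print(query_text)
```

Generating queries from code instead of editing JSON by hand makes it easier to keep a sweep over ligands or sequence variants consistent.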
6.1 How To Use OpenFold3 With Precomputed MSAs
High throughput work often starts with prepared alignments. Create a query JSON that references your main and paired MSA files. Disable the MSA server in the command. Keep a stable output directory so the pipeline can reuse processed features across runs.
```bash
run_openfold predict \
  --query_json /path/to/query_precomputed.json \
  --use_msa_server False \
  --output_dir /path/to/output/ \
  --runner_yaml /path/to/inference_precomputed.yml
```
This mode saves network calls and gives you predictable runtimes for large screens in protein structure prediction.
6.2 Running On Modest GPUs
You can push larger complexes on smaller cards by switching on low memory presets and reducing the number of diffusion samples. Use a runner YAML to apply presets, and keep the PAE head enabled if you need ranking.
```yaml
# runner.yml
model_update:
  presets:
    - predict
    - low_mem
    - pae_enabled
pl_trainer_args:
  devices: 1
  num_nodes: 1
experiment_settings:
  seeds: [100]
```
Start with one seed and two or three samples. Scale up only when the first pass looks promising.
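One way to keep that "start small, then scale" discipline is to build the command from a function with conservative defaults. The sketch below uses only flags that appear elsewhere in this article, and the helper name `build_predict_command` is hypothetical. Whether command-line seed flags override `seeds` in the runner YAML is not stated here, so treat that interplay as something to verify.

```python
import shlex


def build_predict_command(query_json, output_dir, seeds=1, samples=2, runner_yaml=None):
    """Assemble a run_openfold argument list with small defaults for a first pass."""
    if seeds < 1 or samples < 1:
        raise ValueError("seeds and samples must be at least 1")
    cmd = [
        "run_openfold", "predict",
        "--query_json", query_json,
        "--num_model_seeds", str(seeds),
        "--num_diffusion_samples", str(samples),
        "--output_dir", output_dir,
    ]
    if runner_yaml is not None:
        cmd += ["--runner_yaml", runner_yaml]
    return cmd


# First pass: one seed, two samples, low-memory presets from runner.yml.
first_pass = build_predict_command("query.json", "out/", runner_yaml="runner.yml")
print(shlex.join(first_pass))

# To execute for real, pass the list to subprocess.run(first_pass, check=True).
```

Returning a list instead of a string avoids shell quoting problems when paths contain spaces.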
7. Where OpenFold3 Fits In Real Projects

Three patterns show up again and again.
- Hit triage for computational drug discovery. Use the model to generate structures for target plus ligand variants. Rank by ipTM and interface PAE, then pass a handful of poses to physics-based refinement. This is a fast filter that plays well with docking. It is classic AI in drug discovery, and it saves cycles.
- Design loops for enzymes and binders. Inverse folding tools propose sequences. OpenFold3 gives you a structural check on each candidate. Keep the ones that land the right pocket geometry and secondary structure. Feed lessons back into the generator.
- Mechanism sketches with nucleic acids. Protein-RNA and protein-DNA models inform hypotheses you can test at the bench. Even when accuracy is imperfect, the qualitative picture can clarify which mutations matter most.
Many labs will keep using traditional protein structure prediction software for known families. That is fine. The point here is to expand the reachable set, especially for complexes and ligand interactions. When you need an AlphaFold3 alternative that you can tune and ship, this is the obvious choice.
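The hit-triage pattern above boils down to a sort and a cutoff. Here is a minimal sketch, assuming each sample has already been reduced to a record with `iptm` and `interface_pae` values. Those key names and the toy numbers are illustrative, not real model output.

```python
def rank_samples(samples, keep=3):
    """Order by ipTM (higher first), break ties by interface PAE (lower first)."""
    ordered = sorted(samples, key=lambda s: (-s["iptm"], s["interface_pae"]))
    return ordered[:keep]


# Toy records for four diffusion samples of one target-ligand pair.
samples = [
    {"name": "s0", "iptm": 0.62, "interface_pae": 9.1},
    {"name": "s1", "iptm": 0.81, "interface_pae": 4.3},
    {"name": "s2", "iptm": 0.81, "interface_pae": 3.7},
    {"name": "s3", "iptm": 0.40, "interface_pae": 15.0},
]

shortlist = rank_samples(samples)
for s in shortlist:
    print(s["name"], s["iptm"], s["interface_pae"])
```

The shortlist is what you would hand to physics-based refinement. The ranking only narrows the field and does not replace visual inspection of the top poses.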
8. How To Use OpenFold3 In Production With NVIDIA NIM
NVIDIA NIM turns models into web-friendly services, packaged as optimized containers with stable APIs. That matters when you want the same code to run on a laptop, in a private cluster, or behind an internal portal. Deploy the NIM service, point your app at the API, and you can serve predictions to colleagues without asking them to learn command line tools.
A small team can start with the hosted OpenFold3 “Experience” page, then graduate to a container on their own GPUs. Logs, metrics, and reproducible images make audits and handoffs easier. If your org already runs other NIM services, this becomes one more brick in the wall. The value is not only speed but also consistency across research groups and projects.
This is where questions like how to use OpenFold3 move from tutorials into operations. You embrace the same patterns you use elsewhere. Build small clients, keep input schemas tight, and version your runner YAML. Pair the service with lightweight notebooks for analysis. When your needs grow, add a queue, run batched jobs, and store confidence JSON in an object store alongside structures.
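"Keep input schemas tight" can start as simply as rejecting malformed chains before they reach the service. The sketch below is a hypothetical client-side check, not part of OpenFold3 or NIM. The `ChainInput` class and its strict alphabets are assumptions, and a real service may also accept ambiguity codes such as X or N.

```python
from dataclasses import dataclass

# Strict alphabets for illustration; widen them if your pipeline allows ambiguity codes.
ALPHABETS = {
    "protein": set("ACDEFGHIKLMNPQRSTVWY"),
    "dna": set("ACGT"),
    "rna": set("ACGU"),
}


@dataclass(frozen=True)
class ChainInput:
    molecule_type: str
    sequence: str

    def __post_init__(self):
        if self.molecule_type not in ALPHABETS:
            raise ValueError(f"unknown molecule type: {self.molecule_type!r}")
        if not self.sequence:
            raise ValueError("sequence is empty")
        bad = sorted(set(self.sequence.upper()) - ALPHABETS[self.molecule_type])
        if bad:
            raise ValueError(f"invalid characters for {self.molecule_type}: {bad}")


# A valid protein-DNA request passes silently.
chains = [ChainInput("protein", "MQIFVKTLTG"), ChainInput("dna", "ACGTACGT")]
print(f"{len(chains)} chains validated")
```

Failing fast in the client keeps bad jobs out of the queue and makes error messages readable for colleagues who never see the server logs.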
9. Step-By-Step: How To Use OpenFold3 On NVIDIA Build
This quick guide mirrors the hosted flow and adds a few pro tips.
- Open the model page. Navigate to build.nvidia.com/openfold/openfold3.
- Pick an example. Choose “Protein-Ligand Complex” for drug work or “Protein-DNA Complex” for regulatory questions.
- Enter inputs. Paste protein sequences. Add DNA or RNA chains as needed. For ligands, use a CCD code like ATP, or a SMILES string. Attach an MSA file if you have it.
- Tune advanced options. Set the number of diffusion samples. Leave the rest at defaults for your first run.
- Run and wait. Jobs execute on NVIDIA GPUs. You will see status and a 3D preview on completion.
- Read the scores. Note average pLDDT and interface metrics. Keep the JSON with the mmCIF.
- Export. Download the structure, then move into your modeling tools. That might be PyMOL, MD engines, or custom notebooks.
- Iterate. Change the ligand, tweak the sequence, adjust seeds, and compare. Treat the model like a smart prior. It guides the search rather than dictating it.
If you want to automate the same flow, the API reference on the page shows request formats that match the buttons you just clicked. That symmetry keeps your prototype honest when you promote it into a service.
10. Practical Tips That Save Time With OpenFold3
- Keep a small library of query JSON templates for the kinds of complexes you run most.
- Cache MSAs by sequence hash, and reuse them across experiments.
- Store structures and confidence JSON together. Scripts that read one should read the other.
- When scores disagree with intuition, generate a few more samples rather than overfitting to a single output.
- Use simple plots, residue vs pLDDT, chain-pair PAE heat maps, to make decisions visible to the team.
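Caching MSAs by sequence hash, as suggested in the tips above, needs very little code. This is a minimal sketch using SHA-256 from the standard library. The directory layout, the `.a3m` suffix, and the function names are choices made for the example, not anything OpenFold3 prescribes.

```python
import hashlib
from pathlib import Path


def sequence_key(sequence):
    """Hash a sequence after stripping whitespace and normalizing case."""
    normalized = "".join(sequence.split()).upper()
    return hashlib.sha256(normalized.encode("ascii")).hexdigest()


def msa_cache_path(cache_dir, sequence, suffix=".a3m"):
    """Map a sequence to a stable file path, sharded by the first two hex digits."""
    key = sequence_key(sequence)
    return Path(cache_dir) / key[:2] / f"{key}{suffix}"


path = msa_cache_path("msa_cache", "MQIFVKTLTG")
if path.exists():
    print(f"reuse cached alignment at {path}")
else:
    print(f"no cached alignment yet, would write to {path}")
```

Normalizing before hashing means the same protein pasted with line breaks or in lowercase still hits the same cache entry.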
11. Frequently Needed Commands
Here are a few one liners you will use often while learning how to use OpenFold3 locally.
```bash
# Make a new project folder
mkdir -p of3/projects/demo && cd "$_"

# Copy and edit a starter query
cp ~/.cache/openfold3/examples/example_inference_inputs/query_ubiquitin.json ./query.json
nano query.json

# Run with two seeds and three samples each
run_openfold predict \
  --query_json ./query.json \
  --num_model_seeds 2 \
  --num_diffusion_samples 3 \
  --output_dir ./out/

# Convert CIF to PDB after the run, if you prefer PDB (requires the gemmi package)
python - <<'PY'
import glob
import os

import gemmi

# Walk every subdirectory of out/ and write a .pdb next to each .cif
for path in glob.glob('out/**/*.cif', recursive=True):
    st = gemmi.read_structure(path)
    pdb = os.path.splitext(path)[0] + '.pdb'
    st.write_minimal_pdb(pdb)
print('Done')
PY
```
12. A Compact Feature Matrix
If you need a single sheet to brief a colleague, this table will do.
OpenFold3 Quick Reference
| Area | Detail |
|---|---|
| Model Class | Foundation model for protein structure prediction and biomolecular modeling |
| Modalities | Proteins, DNA, RNA, small molecule ligands |
| Outputs | mmCIF or PDB, pLDDT, PAE, ipTM, pTM, ranking JSON |
| Workflows | NVIDIA NIM “Experience,” REST API, local CLI |
| Tuning | Seeds, diffusion samples, templates, MSA modes |
| Hardware | Ampere or Hopper GPUs, Linux |
| License | Apache 2.0 |
| Best For | Computational drug discovery, complex assemblies, research prototyping |
13. Closing: Build Something Real This Week
We are past the phase where structure prediction is a spectator sport. The combination of OpenFold3, a permissive license, and NVIDIA NIM makes it easy to try ideas and keep the promising ones. Start with the hosted page and a simple complex. Pull the files into your notebook. Run a small sweep, and let the confidence metrics help you think. Then decide where to plug the results into your pipeline.
If this inspired you, pick a target and run a first model today. Share one structure with a colleague and ask a pointed question about function. That is how progress feels. Then widen the loop. As you grow, keep notes on what works, and teach the next person how to use OpenFold3 with your examples. You will move faster, and the science will too.
Frequently Asked Questions
1) What is OpenFold3 and why is it a big deal for scientific research?
OpenFold3 is a fully open-source model that predicts 3D structures of biomolecular complexes, and it’s released under Apache 2.0 so labs and companies can use and adapt it for commercial R&D. That combination of capability and license makes state-of-the-art structure prediction broadly accessible.
2) What is the easiest way for a researcher to try OpenFold3 right now?
Use the NVIDIA NIM OpenFold3 page, choose an example like Protein-Ligand or Protein-DNA, paste sequences plus a ligand CCD code or SMILES string, click Run, then download the mmCIF and confidence scores for analysis. No local setup is required.
3) How does OpenFold3 compare to DeepMind’s AlphaFold3?
OpenFold3 is a PyTorch re-implementation that targets performance parity on core tasks, while offering an open license for commercial use and a simple path to deploy through NIM containers.
4) What types of molecules can OpenFold3 predict?
It models proteins, DNA, RNA, and small-molecule ligands, and can co-fold full complexes so you can study interfaces and binding poses in one run.
5) Can I use OpenFold3 for my company’s drug discovery pipeline?
Yes. The Apache 2.0 licensing permits commercial use, and organizations are already integrating OpenFold3 into discovery workflows. You can deploy it quickly via NVIDIA NIM.
