Models AI Projects
Daily ranking of open-source AI repositories in the Models category.
Models tracks 1449 repositories with 5392909 total GitHub stars.
- supertone-inc/supertonic - Lightning-Fast, On-Device, Multilingual TTS — running natively via ONNX. (4877 stars, Swift, Models)
- cactus-compute/needle - 26m function call model that runs on incredibly small devices (1537 stars, Python, Models)
- shiyu-coder/Kronos - Kronos: A Foundation Model for the Language of Financial Markets (24551 stars, Python, Models)
- kyegomez/OpenMythos - A theoretical reconstruction of the Claude Mythos architecture, built from first principles using the available research literature. (12750 stars, Python, Models)
- OpenBMB/VoxCPM - VoxCPM2: Tokenizer-Free TTS for Multilingual Speech Generation, Creative Voice Design, and True-to-Life Cloning (18765 stars, Python, Models)
- OpenBMB/MiniCPM-V - A Pocket-Sized MLLM for Ultra-Efficient Image and Video Understanding on Your Phone (24841 stars, Python, Models)
- jingyaogong/minimind-o - 🎙️ A 0.1B Omni model trained from scratch, capable of listening, speaking, and seeing! (1197 stars, Python, Models)
- openai/whisper - Robust Speech Recognition via Large-Scale Weak Supervision (99478 stars, Python, Models)
- AI4Finance-Foundation/FinGPT - FinGPT: Open-Source Financial Large Language Models! Revolutionize 🔥 We release the trained model on HuggingFace. (20108 stars, Jupyter Notebook, Models)
- ultralytics/ultralytics - Ultralytics YOLO 🚀 (57121 stars, Python, Models)
- OpenMOSS/MOSS-TTS-Nano - MOSS-TTS-Nano is an open-source multilingual tiny speech generation model from MOSI.AI and the OpenMOSS team. With only 0.1B parameters, it is designed... (2957 stars, Python, Models)
- google-research/timesfm - TimesFM (Time Series Foundation Model) is a pretrained time-series foundation model developed by Google Research for time-series forecasting. (19770 stars, Python, Models)
- k2-fsa/sherpa-onnx - Speech-to-text, text-to-speech, speaker diarization, speech enhancement, source separation, and VAD using next-gen Kaldi with onnxruntime without Intern... (12230 stars, C++, Models)
- PriorLabs/TabPFN - ⚡ TabPFN: Foundation Model for Tabular Data ⚡ (7043 stars, Python, Models)
- QwenLM/Qwen3-TTS - Qwen3-TTS is an open-source series of TTS models developed by the Qwen team at Alibaba Cloud, supporting stable, expressive, and streaming speech genera... (11354 stars, Python, Models)
- index-tts/index-tts - An Industrial-Level Controllable and Efficient Zero-Shot Text-To-Speech System (20502 stars, Python, Models)
- huggingface/pytorch-image-models - The largest collection of PyTorch image encoders / backbones. Including train, eval, inference, export scripts, and pretrained weights -- ResNet, ResNeX... (36804 stars, Python, Models)
- lucas-maes/le-wm - Official code base for LeWorldModel: Stable End-to-End Joint-Embedding Predictive Architecture from Pixels (3309 stars, Python, Models)
- fishaudio/fish-speech - SOTA Open Source TTS (30315 stars, Python, Models)
- FunAudioLLM/CosyVoice - Multi-lingual large voice generation model, providing inference, training and deployment full-stack ability. (21026 stars, Python, Models)
- deepseek-ai/DeepSeek-V3 - (103507 stars, Python, Models)
- xinntao/Real-ESRGAN - Real-ESRGAN aims at developing Practical Algorithms for General Image/Video Restoration. (35416 stars, Python, Models)
- Lightricks/LTX-2 - Official Python inference and LoRA trainer package for the LTX-2 audio–video generative model. (6644 stars, Python, Models)
- facebookresearch/tribev2 - This repository contains the code to train and evaluate TRIBE v2, a multimodal model for brain response prediction (2588 stars, Jupyter Notebook, Models)
- modelscope/FunASR - A Fundamental End-to-End Speech Recognition Toolkit and Open Source SOTA Pretrained Models, Supporting Speech Recognition, Voice Activity Detection, Tex... (16063 stars, Python, Models)
- Wan-Video/Wan2.2 - Wan: Open and Advanced Large-Scale Video Generative Models (15722 stars, Python, Models)
- DepthAnything/Depth-Anything-V2 - [NeurIPS 2024] Depth Anything V2. A More Capable Foundation Model for Monocular Depth Estimation (8116 stars, Python, Models)
- facebookresearch/sam3 - The repository provides code for running inference and finetuning with the Meta Segment Anything Model 3 (SAM 3), links for downloading the trained mode... (9516 stars, Python, Models)
- Tencent-Hunyuan/OmniWeaving - Official Implementation of OmniWeaving: Towards Unified Video Generation with Free-form Composition and Reasoning (871 stars, Python, Models)
- deepseek-ai/DeepSeek-Coder - DeepSeek Coder: Let the Code Write Itself (23339 stars, Python, Models)
- ByteDance-Seed/Depth-Anything-3 - Depth Anything 3 (5264 stars, Python, Models)
- NVlabs/GR00T-WholeBodyControl - Welcome to GR00T Whole-Body Control (WBC)! This is a unified platform for developing and deploying advanced humanoid controllers. This includes: Decoupl... (1988 stars, Python, Models)
- gangweix/pixel-perfect-depth - [NeurIPS 2025] Pixel-Perfect Depth (987 stars, Python, Models)
- NVlabs/PixelDiT - [CVPR 2026 Oral] Pixel Diffusion Transformers for Image Generation (558 stars, Python, Models)
- microsoft/BitNet - Official inference framework for 1-bit LLMs (38973 stars, Python, Models)
- openai/privacy-filter - OpenAI Privacy Filter (2125 stars, Python, Models)
- facebookresearch/sapiens2 - 1K resolution vision transformers pretrained on 1B human images. (653 stars, Python, Models)
- ultralytics/yolov5 - YOLOv5 🚀 in PyTorch > ONNX > CoreML > TFLite (57374 stars, Python, Models)
- 2noise/ChatTTS - A generative speech model for daily dialogue. (39254 stars, Python, Models)
- openai/CLIP - CLIP (Contrastive Language-Image Pretraining), Predict the most relevant text snippet given an image (33487 stars, Jupyter Notebook, Models)
- facebookresearch/sam2 - The repository provides code for running inference with the Meta Segment Anything Model 2 (SAM 2), links for downloading the trained model checkpoints,... (19151 stars, Jupyter Notebook, Models)
- microsoft/fara - Fara-7B: An Efficient Agentic Model for Computer Use (5087 stars, Python, Models)
- facebookresearch/vjepa2 - PyTorch code and models for VJEPA2 self-supervised learning from video. (3893 stars, Python, Models)
- NVIDIA/DreamDojo - Official Codebase for "DreamDojo: A Generalist Robot World Model from Large-Scale Human Videos" (843 stars, Python, Models)
- deepinsight/insightface - State-of-the-art 2D and 3D Face Analysis Project (28671 stars, Python, Models)
- NVIDIA/personaplex - PersonaPlex code. (9841 stars, Python, Models)
- snakers4/silero-vad - Silero VAD: pre-trained enterprise-grade Voice Activity Detector (9038 stars, Python, Models)
- QwenLM/Qwen3.6 - Qwen3.6 is the large language model series developed by Qwen team, Alibaba Group. (3374 stars, Unknown, Models)
- Tencent-Hunyuan/HY-World-2.0 - HY-World 2.0: A Multi-Modal World Model for Reconstructing, Generating, and Simulating 3D Worlds (1887 stars, Python, Models)
- talkie-lm/talkie - talkie is a vintage language model from 1930 (820 stars, Python, Models)
- yuantianyuan01/FastWAM - Official codebase for Fast-WAM: Do World Action Models Need Test-time Future Imagination? (699 stars, Python, Models)
- DLR-RM/stable-baselines3 - PyTorch version of Stable Baselines, reliable implementations of reinforcement learning algorithms. (13260 stars, Python, Models)
- MoonshotAI/Kimi-K2 - Kimi K2 is the large language model series developed by Moonshot AI team (10769 stars, Unknown, Models)
- kyutai-labs/moshi - Moshi is a speech-text foundation model and full-duplex spoken dialogue framework. It uses Mimi, a state-of-the-art streaming neural audio codec. (10202 stars, Python, Models)
- Tencent-Hunyuan/HunyuanVideo-1.5 - HunyuanVideo-1.5: A leading lightweight video generation model (4508 stars, Python, Models)
- deepseek-ai/DeepSeek-OCR-2 - Visual Causal Flow (2843 stars, Python, Models)
- pnnbao97/VieNeu-TTS - Vietnamese TTS with instant voice cloning • On-device • Real-time CPU inference • 24kHz audio quality • Converts text to Vietnamese speech • Text... (1540 stars, Python, Models)
- facebookresearch/segment-anything - The repository provides code for running inference with the SegmentAnything Model (SAM), links for downloading the trained model checkpoints, and exampl... (54147 stars, Jupyter Notebook, Models)
- huggingface/diffusers - 🤗 Diffusers: State-of-the-art diffusion models for image, video, and audio generation in PyTorch. (33614 stars, Python, Models)
- mlfoundations/open_clip - An open source implementation of CLIP. (13802 stars, Python, Models)