# vision

Here are 1,685 public repositories matching this topic...

BVLC / caffe

Caffe: a fast open framework for deep learning.

machine-learning deep-learning vision

Updated Jul 31, 2024
C++

XTLS / Xray-core

Xray, Penetrates Everything. Also the best v2ray-core, with XTLS support. Fully compatible configuration.

Updated Dec 18, 2024
Go

danny-avila / LibreChat

Enhanced ChatGPT Clone: Features Agents, Anthropic, AWS, OpenAI, Assistants API, Azure, Groq, o1, GPT-4o, Mistral, OpenRouter, Vertex AI, Gemini, Artifacts, AI model switching, message search, Code Interpreter, langchain, DALL-E-3, OpenAPI Actions, Functions, Secure Multi-User Auth, Presets, open-source for self-hosting. Active project.

Updated Dec 18, 2024
TypeScript

PaddlePaddle / PaddleHub

Awesome pre-trained models toolkit based on PaddlePaddle. (400+ models including Image, Text, Audio, Video and Cross-Modal with Easy Inference & Serving) [Security hardening in progress; interaction is paused, please be patient]

nlp awesome deep-learning model vision text2image

Updated Aug 7, 2024
Python

Skyvern-AI / skyvern

Automate browser-based workflows with LLMs and Computer Vision

python api workflow automation browser computer vision gpt browser-automation rpa playwright llm

Updated Dec 18, 2024
Python

mediar-ai / screenpipe

Build AI agents that have the full context. Open source, runs locally, developer friendly. 24/7 screen, mic, and keyboard recording and control.

machine-learning ai computer-vision ml agi vision agents multimodal llm

Updated Dec 18, 2024
TypeScript

mrousavy / react-native-vision-camera

📸 A powerful, high-performance React Native Camera library.

Updated Dec 16, 2024
Swift

Dooy / chatgpt-web-midjourney-proxy

One UI for ChatGPT web, Midjourney, GPTs, Suno, Luma, Runway, Viggle, Flux, Ideogram, Realtime, Pika, and Udio; supports Web, PWA, Linux, Windows, and macOS.

flux realtime vision runway pika ideogram luma gpts midjourney chatgpt-ui midjourney-ui gptstore gpts-ui whisper-ui suno claude-3 udio viggle kling

Updated Dec 17, 2024
JavaScript

TEN-framework / TEN-Agent

TEN Agent is a conversational AI powered by TEN, integrating Gemini 2.0 Multimodal Live API, OpenAI Realtime API, RTC, and more. It offers real-time capabilities to see, hear, and speak, along with advanced tools like weather checks, web search, and RAG.

Updated Dec 18, 2024
Python

artemnovichkov / iOS-11-by-Examples

👨🏻‍💻 Examples of new iOS 11 APIs

swift vision xcode9 ios11 arkit coreml core-nfc

Updated Dec 31, 2021
Swift

autorope / donkeycar

Open source hardware and software platform to build a small scale self driving car.

python raspberry-pi tensorflow keras vision self-driving-car cv2 donkeycar jetson-nano

Updated Sep 15, 2024
Python

sightmachine / SimpleCV

The Open Source Framework for Machine Vision

python computer-vision cv image-processing vision visionprocessing

Updated Jan 14, 2023
Python

NextLevel / NextLevel

⬆️ Media Capture in Swift

swift ios instagram snapchat video camera custom photography augmented-reality ar media avfoundation capture vision nextlevel mixed-reality coreimage arkit tiktok

Updated Aug 12, 2024
Swift

GoogleCloudPlatform / java-docs-samples

Java and Kotlin Code samples used on cloud.google.com

kotlin java appengine video cdn auth samples vision translate automl

Updated Dec 18, 2024
Java

andyzeng / tsdf-fusion-python

Python code to fuse multiple RGB-D images into a TSDF voxel volume.

cuda artificial-intelligence vision rgbd 3d 3d-reconstruction depth-camera volumetric-data 3d-deep-learning tsdf kinect-fusion

Updated Feb 18, 2023
Python

roatienza / Deep-Learning-Experiments

Videos, notes and experiments to understand deep learning

nlp deep-learning speech pytorch artificial-intelligence vision deep-learning-tutorial

Updated Dec 15, 2024
Jupyter Notebook

KevinGong2013 / ChineseIDCardOCR

[Deprecated] 🇨🇳 Optical recognition of second-generation Chinese ID cards

swift machine-learning deep-learning xcode cnn vision ios11 coreml

Updated Feb 7, 2018
Swift

lucidrains / mlp-mixer-pytorch

An All-MLP solution for Vision, from Google AI

deep-learning vision

Updated Sep 13, 2024
Python

aheze / OpenFind

An app to find text in real life.

swift photos ios app ocr camera uikit find realm vision hacktoberfest swiftui

Updated Feb 10, 2023
Swift

jenly1314 / MLKit

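🌝 MLKit is a powerful, easy-to-use toolkit. With ML Kit you can easily implement text recognition, barcode recognition, image labeling, face detection, object detection, and more.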

android machine-learning ocr recognition qrcode barcode vision text-recognition face-detection machine-learning-library object-detection object-recognition barcode-scanning mlkit image-labeling camerax pose-detection segmentation-selfie

Updated Aug 18, 2024
Java
