#Computer Science#Code for Machine Learning for Algorithmic Trading, 2nd edition.
Mimesis is a robust data generator for Python that can produce a wide range of fake data in multiple languages.
#Natural Language Processing#Data processing for and with foundation models! 🍎 🍋 🌽 ➡️ ➡️🍸 🍹 🍷
Open source data anonymization and synthetic data platform for developers. Anonymize your production data and sync it across your environments so that developers can safely use it.
#Computer Science#The easiest tool for fine-tuning LLM models, synthetic data generation, and collaborating on datasets.
A procedural Blender pipeline for photorealistic training image generation
Distilabel is a framework for synthetic data and AI feedback for engineers who need fast, reliable and scalable pipelines based on verified research papers.
#Computer Science#Synthetic data generation for tabular, relational, and time series data
Synthetic Patient Population Simulator
#Large Language Models#SDG is a specialized framework designed to generate high-quality structured tabular data.
#Computer Science#UnrealCV: Connecting Computer Vision to Unreal Engine
#Computer Science#Synthetic data generators for tabular and time-series data
The Declarative Data Generator
Conditional GAN for generating synthetic tabular data.
PostgreSQL database anonymization and synthetic data generation tool
#Natural Language Processing#DataDreamer: Prompt. Generate Synthetic Data. Train & Align Models. 🤖💤
#Natural Language Processing#Synthetic data curation for post-training and structured data extraction
A framework for comprehensive diagnosis and optimization of agents using simulated, realistic synthetic interactions
#Large Language Models#A lightweight library for generating synthetic instruction tuning datasets for your data without GPT.
#Natural Language Processing#Curated list of open source tooling for data-centric AI on unstructured data.