Fay Digital Assistant Edition is a branch of the Fay open-source project that focuses on building open-source intelligent digital assistants. Its modular design lets developers customize and combine functional modules such as emotion analysis, NLP processing, speech synthesis, and speech output. With this edition, developers can build personalized, multifunctional digital assistants that offer voice interaction across a range of scenarios and domains.
Fay Digital Assistant Edition
ProTip: The shopping edition has been moved to a separate branch, fay-sales-edition.
Assistant Fay controller use: voice communication with voice and text replies; text communication with text replies. To connect UE, Live2D, or xuniren, you need to turn off panel playback.
Assistant Fay controller
  Remote Android         Local PC            Remote PC
         └───────────────────┼───────────────────┘
          Aliyun API ─┐      │
                      ├──── ASR
              FunASR ─┘      │          ┌─ Yuan 1.0
                             │          ├─ LingJu
                            NLP ────────┼─ GPT/ChatGPT
                             │          ├─ Rasa+ChatGLM-6B
               Azure ─┐      │          ├─ VisualGLM
            Edge TTS ─┼──── TTS         └─ RWKV
     Open source TTS ─┘      │
                             │
                             │
         ┌──────────┬────────┼────────┬──────────┐
  Remote Android Live2D     UE     xuniren   Remote PC
Important: Communication interface between Fay (server) and digital human (client): 'ws://127.0.0.1:10002' (connected)
Message format: see WebSocket.md
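Before pointing a digital human client at Fay, it can help to confirm that something is listening on the port above. The sketch below uses only the Python standard library. It checks TCP reachability of the host and port taken from the address in this README; it does not perform a WebSocket handshake or send any message (the message format is described in WebSocket.md). The names `split_ws_url` and `is_listening` are illustrative and not part of Fay.

```python
import socket
from urllib.parse import urlparse

FAY_WS_URL = "ws://127.0.0.1:10002"  # address quoted in this README


def split_ws_url(url):
    """Return (host, port) from a ws:// or wss:// URL."""
    parsed = urlparse(url)
    if parsed.scheme not in ("ws", "wss"):
        raise ValueError(f"not a WebSocket URL: {url!r}")
    # Fall back to the scheme's default port when none is given.
    port = parsed.port or (443 if parsed.scheme == "wss" else 80)
    return parsed.hostname, port


def is_listening(url, timeout=1.0):
    """Return True if a TCP connection to the URL's host and port succeeds."""
    host, port = split_ws_url(url)
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    state = "reachable" if is_listening(FAY_WS_URL) else "not reachable"
    print(f"{FAY_WS_URL} is {state}")
```

A `True` result only shows that the port accepts connections. It does not prove that the listener is Fay or that it speaks the expected protocol.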
Code Structure
.
├── main.py # Program main entry
├── fay_booter.py # Core boot module
├── config.json # Controller configuration file
├── system.conf # System configuration file
├── ai_module
│ ├── ali_nls.py # Aliyun Real-time Voice
│ ├── ms_tts_sdk.py # Microsoft Text-to-Speech
│ ├── nlp_lingju.py # Lingju Human-Machine Interaction - Natural Language Processing
│ ├── xf_aiui.py # Xunfei Human-Machine Interaction - Natural Language Processing
│ ├── nlp_gpt.py # GPT API integration
│ ├── nlp_chatgpt.py # Reverse-engineered integration with chat.openai.com
│ ├── nlp_yuan.py # Inspur (Langchao) Yuan model integration
│ ├── nlp_rasa.py # Rasa dialogue management in front of ChatGLM-6B (highly recommended)
│ ├── nlp_VisualGLM.py # Integration with multimodal large language model VisualGLM-6B
│ ├── nlp_rwkv.py # Offline integration with rwkv
│ ├── nlp_rwkv_api.py # rwkv server API
│ ├── yolov8.py # YOLOv8 object detection
│ └── xf_ltp.py # Xunfei Sentiment Analysis
├── bin # Executable file directory
├── core # Digital Human Core
│ ├── fay_core.py # Digital Human Core module
│ ├── recorder.py # Recorder
│ ├── tts_voice.py # Speech synthesis enumeration
│ ├── authorize_tb.py # fay.db authentication table management
│ ├── content_db.py # fay.db content table management
│ ├── interact.py # Interaction (message) object
│ ├── song_player.py # Music player (currently unavailable)
│ └── wsa_server.py # WebSocket server
├── gui # Graphical interface
│ ├── flask_server.py # Flask server
│ ├── static
│ ├── templates
│ └── window.py # Window module
├── scheduler
│ └── thread_manager.py # Scheduler manager
├── utils # Utility modules
│ ├── config_util.py
│ ├── storer.py
│ └── util.py
└── test # All surprises
Upgrade Log
2023.08.23:
- Replace the GPT integration method;
- Add ChatGLM2 integration.
2023.08.16:
- Reduced the high system resource consumption caused by UE repeatedly reconnecting;
- Automatically control whether to start panel playback;
- Automatically delete runtime logs.
2023.08.09:
- Remove mp3 format warning message;
- Remove Lingju and Rwkv interface warning message;
- Optimize websocket logic;
- Optimize digital human interface communication.
2023.08.04:
- UE5 project updated.
- The viseme interval used for lip-sync is changed to 33 ms.
- The built-in rwkv_api NLP can be used directly.
- Emotion updates are pushed to the digital human client less frequently.
- No interface message is generated when the digital human is not connected.
- Fixed playback information occasionally not being pushed to the digital human client because of an incorrect mp3 format.
- Fixed the NLP logic ending early when commands such as mute were executed, which prevented the user's question message from being pushed to the digital human client.
- WAV files are now cleaned up at startup.
- WebSocket tool class is upgraded and improved.
2023.07:
- Add runtime automatic cleaning of UI cache;
- Allow the GPT proxy setting to be empty;
- Improve the stability of the Lingju integration.
- Fixed the problem of generating a large amount of WS information before connecting digital humans;
- Add digital human (UE, Live2D, Xuniren) communication interface: real-time logs;
- Update digital human (UE, Live2D, Xuniren) communication interface: audio push.
- Multiple updates for the shopping edition.
- Fixed the issue of remote voice recognition.
- Fixed the issue of occasional unresponsiveness during ASR (Automatic Speech Recognition).
- Removed the singing command.
- Fixed Linux and macOS runtime errors.
- Fixed the issue of being unable to continue execution due to lip-sync errors.
- Provided an integration solution for RWKV.
- Fixed an issue in Assistant Edition where text input does not read persona responses.
- Fixed an issue in Assistant Edition where text input does not read QA responses.
- Enhanced microphone stability.
- Fixed a sound playback issue caused by the inability to run the lip-sync algorithm.
2023.06:
- Refactored NLP module management logic for easier extension.
- Split GPT into ChatGPT and GPT, replaced with a new GPT interface, and added the ability to configure proxy servers separately.
- Specified the version of the YOLOv8 package to resolve YOLO compatibility issues.
- Fixed the self-talk bug and the bug where multiple pending messages were received.
- Integrated Lingju NLP API (supporting GPT3.5 and multiple applications).
- UI corrections.
- Integrated local lip-sync algorithm.
- Resolved compatibility issues with multi-channel microphones.
- Refactored fay_core.py and fay_booter.py code.
- UI layout adjustments.
- Restored sound selection.
- Fixed logic for displaying "Thinking..."
Installation Instructions
Environment
- Python 3.9, 3.10
- Windows, macOS, Linux
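If you want to check the interpreter before installing anything, a small guard along these lines can help. This is a sketch that treats the two Python versions listed above as the tested ones; other versions may or may not work. The function name `is_supported` is illustrative.

```python
import sys

SUPPORTED = {(3, 9), (3, 10)}  # versions listed in this README


def is_supported(version_info=None):
    """Return True if the (major, minor) pair is one this README lists."""
    info = sys.version_info if version_info is None else version_info
    return (info[0], info[1]) in SUPPORTED


if __name__ == "__main__":
    current = f"{sys.version_info[0]}.{sys.version_info[1]}"
    if is_supported():
        print(f"Python {current} is listed as supported")
    else:
        print(f"Python {current} is not listed; 3.9 or 3.10 is expected")
```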
Installing Dependencies
pip install -r requirements.txt
Configuring Application Key
- See the API Modules section below.
- Open the link for each service you need, register, and create an application. Fill in the application key in ./system.conf
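To spot keys that are still blank before starting, you could scan the file with the standard library `configparser`. The sketch below assumes that system.conf is an INI-style file and that the keys live in a section named `key`; both are assumptions, so adjust `section` to match your file. The function name `find_empty_keys` is hypothetical.

```python
from configparser import ConfigParser


def find_empty_keys(path, section="key"):
    """Return option names in `section` whose value is empty."""
    # interpolation=None keeps '%' characters in key values from raising errors.
    parser = ConfigParser(interpolation=None)
    # read() returns the list of files it parsed; an empty list means none was found.
    if not parser.read(path, encoding="utf-8"):
        raise FileNotFoundError(path)
    if not parser.has_section(section):
        return []
    return [name for name, value in parser.items(section) if not value.strip()]


if __name__ == "__main__":
    try:
        missing = find_empty_keys("./system.conf")
    except FileNotFoundError:
        print("system.conf not found in the current directory")
    else:
        print("Empty keys:", ", ".join(missing) if missing else "none")
```

Since most services are marked optional, an empty key is only a problem for the modules you intend to use.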
Starting
Starting Fay Controller
python main.py
API Modules
The application keys must be filled in before starting.
File | Description | Link |
---|---|---|
./ai_module/ali_nls.py | Real-time Speech Recognition (Optional) | https://ai.aliyun.com/nls/trans |
./ai_module/ms_tts_sdk.py | Microsoft Text-to-Speech with Emotion (Optional) | https://azure.microsoft.com/zh-cn/services/cognitive-services/text-to-speech/ |
./ai_module/xf_ltp.py | Xunfei Sentiment Analysis (Optional) | https://www.xfyun.cn/service/emotion-analysis |
./utils/ngrok_util.py | ngrok.cc tunneling for external network access (Optional) | http://ngrok.cc |
./ai_module/nlp_lingju.py | Lingju NLP API (supports GPT3.5 and multiple applications) (Optional) | https://open.lingju.ai (contact customer service to enable GPT3.5 access) |
./ai_module/yuan_1_0.py | Inspur (Langchao) Yuan Model (Optional) | https://air.inspur.com/ |
Instructions for Use
- Voice Assistant: Fay Controller (with microphone input source enabled and panel playback enabled).
- Remote Voice Assistant: Fay Controller (with panel playback disabled) + Remote device integration.
- Digital Human Interaction: Fay Controller (with microphone input source enabled, panel playback disabled, and persona Q&A filled in) + Digital Human.
- Jarvis, Her: Join us to complete the experience together.
Voice Commands
Shut down | Mute | Unmute |
---|---|---|
Shut down, Goodbye, Go away | Mute, Be quiet, I want silence | Unmute, Where are you, You can speak now |
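To illustrate how the phrases in the table could map to commands, here is a minimal sketch. The phrases are copied from the table; the command names and the matching logic (case-insensitive exact match after trimming punctuation) are assumptions for illustration and may differ from what Fay does internally.

```python
# Phrases copied from the table above; command names are illustrative.
COMMANDS = {
    "shut_down": ("shut down", "goodbye", "go away"),
    "mute": ("mute", "be quiet", "i want silence"),
    "unmute": ("unmute", "where are you", "you can speak now"),
}


def match_command(text):
    """Return the command whose phrase equals the normalized text, else None."""
    # Lowercase, trim surrounding punctuation, and collapse repeated spaces.
    normalized = " ".join(text.lower().strip(" .!?,").split())
    for command, phrases in COMMANDS.items():
        if normalized in phrases:
            return command
    return None


if __name__ == "__main__":
    for spoken in ("Be quiet!", "Where are you?", "What time is it?"):
        print(repr(spoken), "->", match_command(spoken))
```

Exact matching is used instead of substring matching so that "Unmute" is not mistaken for "Mute".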
For business inquiries
**Business QQ**: 467665317