+ 修复连接数字人之前产生大量ws信息问题; + 增加数字人(ue、live2d、xuniren)通讯接口:实时日志; + 更新数字人(ue、live2d、xuniren)通讯接口:音频推送。
This commit is contained in:
parent fb8caf7645
commit c99ee0cc5a

215 README.md
@@ -1,3 +1,5 @@
[`English`](https://github.com/TheRamU/Fay/blob/main/README_EN.md)

<div align="center">
<br>
<img src="images/icon.png" alt="Fay">
@@ -8,24 +10,6 @@

Fay数字人助理版是fay开源项目的重要分支,专注于构建智能数字助理的开源解决方案。它提供了灵活的模块化设计,使开发人员能够定制和组合各种功能模块,包括情绪分析、NLP处理、语音合成和语音输出等。Fay数字人助理版为开发人员提供了强大的工具和资源,用于构建智能、个性化和多功能的数字助理应用。通过该版本,开发人员可以轻松创建适用于各种场景和领域的数字人助理,为用户提供智能化的语音交互和个性化服务。

## **推荐玩法**

灵聚NLP api(支持GPT3.5及多应用):https://m.bilibili.com/video/BV1NW4y1D76a

集成本地唇型算法:https://www.bilibili.com/video/BV1Zh4y1g7o7/?buvid=XXDD0B5DD6C43C070DF9E7E67930FC48B24DF&is_story_h5=false&mid=Pvwl%2Ft1ahPM726k1L4%2FnRA%3D%3D&plat_id=202&share_from=ugc&share_medium=android&share_plat=android&share_source=WEIXIN&share_tag=s_i&timestamp=1686926382&unique_k=Jdqazy3&up_id=2111554564

给数字人加上眼睛(集成yolo+VisualGLM):B站视频

给Fay加上本地免费语音识别(达摩院funasr): https://www.bilibili.com/video/BV1qs4y1g74e/?share_source=copy_web&vd_source=64cd9062f5046acba398177b62bea9ad

消费级pc大模型(ChatGLM-6B的基础上前置Rasa会话管理):https://m.bilibili.com/video/BV1D14y1f7pr

UE5工程:https://github.com/xszyou/fay-ue5

真人视频三维重建(NeRF):https://github.com/waityousea/xuniren

## **Fay数字人助理版**
@@ -34,78 +18,69 @@ UE5工程:https://github.com/xszyou/fay-ue5



助理版Fay控制器使用:语音沟通,语音和文字回复;文字沟通,文字回复。
助理版Fay控制器使用:语音沟通,语音和文字回复;文字沟通,文字回复;对接UE、live2d、xuniren,需关闭面板播放。

### **PC远程助理** [`PC demo`](https://github.com/TheRamU/Fay/tree/main/python_connector_demo)

### **手机远程助理** [`android demo`](https://github.com/TheRamU/Fay/tree/main/android_connector_demo)
## **二、Fay助理版**

### **与数字形象通讯**(非必须,控制器需要关闭“面板播放”)
Remote Android Local PC Remote PC

控制器采用 WebSocket 方式与 UE 通讯
└─────────────┼─────────────┘

Aliyun API ─┐ │

├── ASR

[FunASR](https://www.bilibili.com/video/BV1qs4y1g74e) ─┘ │ ┌─ Yuan 1.0

│ ├─ [LingJu](https://www.bilibili.com/video/BV1NW4y1D76a/)

NLP ────┼─ [GPT/ChatGPT](https://www.bilibili.com/video/BV1Dg4y1V7pn)

│ ├─ [Rasa+ChatGLM-6B](https://www.bilibili.com/video/BV1D14y1f7pr)

Azure ─┐ │ ├─ [VisualGLM](https://www.bilibili.com/video/BV1mP411Q7mj)

Edge TTS ─┼── TTS └─ [RWKV](https://www.bilibili.com/video/BV1yu41157zB)

[开源 TTS](https://www.bilibili.com/read/cv25192534) ─┘ │

│

│

┌──────────┬────┼───────┬─────────┐



下载工程: [https://pan.baidu.com/s/1RBo2Pie6A5yTrCf1cn_Tuw?pwd=ck99](https://pan.baidu.com/s/1RBo2Pie6A5yTrCf1cn_Tuw?pwd=ck99)

下载windows运行包: [https://pan.baidu.com/s/1CsJ647uV5rS2NjQH3QT0Iw?pwd=s9s8](https://pan.baidu.com/s/1CsJ647uV5rS2NjQH3QT0Iw?pwd=s9s8)
Remote Android [Live2D](https://www.bilibili.com/video/BV1sx4y1d775/?vd_source=564eede213b9ddfa9a10f12e5350fd64) [UE](https://www.bilibili.com/read/cv25133736) [xuniren](https://www.bilibili.com/read/cv24997550) Remote PC



工程:https://github.com/xszyou/fay-ue5

重要:

Fay(服务端)与数字人的通讯接口: [`ws://127.0.0.1:10002`](ws://127.0.0.1:10002)(已接通)
重要:Fay(服务端)与数字人(客户端)的通讯接口: [`ws://127.0.0.1:10002`](ws://127.0.0.1:10002)(已接通)

消息格式: 查看 [WebSocket.md](https://github.com/TheRamU/Fay/blob/main/WebSocket.md)
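The `ws://127.0.0.1:10002` interface above is the channel Fay (the server) uses to push messages to the digital-human client. Below is a minimal client sketch, assuming the third-party `websockets` package (`pip install websockets`) and that messages arrive as the JSON documents described in WebSocket.md; it is an illustration, not code from this repository.

```python
# Minimal digital-human client sketch (illustrative; not part of the Fay repository).
# Assumes the `websockets` package and JSON messages shaped as in WebSocket.md.
import asyncio
import json

import websockets


async def listen(uri: str = "ws://127.0.0.1:10002") -> None:
    async with websockets.connect(uri) as ws:
        async for raw in ws:                    # each push from Fay
            msg = json.loads(raw)
            data = msg.get("Data", {})
            print(msg.get("Topic"), data.get("Key"), data.get("Value"))


if __name__ == "__main__":
    asyncio.run(listen())
```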
### **与远程音频输入输出设备连接**(非必须,外网需要配置http://ngrok.cc tcp通道的clientid)

控制器采用 socket(非websocket)方式与音频输出设备通讯

内网通讯地址: [`ws://127.0.0.1:10001`](ws://127.0.0.1:10001)

外网通讯地址: 通过http://ngrok.cc获取(有伙伴愿意赞助服务器给社区免费使用吗?)



消息格式: 参考 [remote_audio.py](https://github.com/TheRamU/Fay/blob/main/python_connector_demo/remote_audio.py)


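The remote-audio channel above is a plain TCP socket (not WebSocket) on port 10001; the actual audio framing lives in remote_audio.py and is not reproduced here. A minimal connectivity check, under that assumption only:

```python
# Connectivity check for the remote-audio channel (illustrative only).
# The real send/receive framing is implemented in python_connector_demo/remote_audio.py.
import socket

with socket.create_connection(("127.0.0.1", 10001), timeout=5) as conn:
    print("connected to Fay audio channel:", conn.getpeername())
```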

## **二、Fay控制器核心逻辑**



**注:**

以上每个模块可轻易替换成自家核心产品。

### **目录结构**
### ***Code structure***

```
.
├── main.py # 程序主入口
├── fay_booter.py # 核心启动模块
├── config.json # 控制器配置文件
├── system.conf # 系统配置文件
├── ai_module
│ ├── ali_nls.py # 阿里云 实时语音
│ ├── ms_tts_sdk.py # 微软 文本转语音
│ ├── nlp_lingju.py # 灵聚 人机交互-自然语言处理
│ ├── xf_aiui.py # 讯飞 人机交互-自然语言处理
@@ -114,6 +89,8 @@ Fay(服务端)与数字人的通讯接口: [`ws://127.0.0.1:10002`](ws://127
│ ├── nlp_yuan.py # 浪潮.源大模型对接
│ ├── nlp_rasa.py # ChatGLM-6B的基础上前置Rasa会话管理(强烈推荐)
│ ├── nlp_VisualGLM.py # 对接多模态大语言模型VisualGLM-6B
│ ├── nlp_rwkv.py # 离线对接rwkv
│ ├── nlp_rwkv_api.py # rwkv server api
│ ├── yolov8.py # yolov8姿态识别
│ └── xf_ltp.py # 讯飞 情感分析
├── bin # 可执行文件目录
@@ -142,6 +119,30 @@ Fay(服务端)与数字人的通讯接口: [`ws://127.0.0.1:10002`](ws://127

## **三、升级日志**

**2023.07.26:**

+ 修复连接数字人之前产生大量ws信息问题;
+ 增加数字人(ue、live2d、xuniren)通讯接口:实时日志;
+ 更新数字人(ue、live2d、xuniren)通讯接口:音频推送。

**2023.07.21:**

+ 带货版多项更新;

**2023.07.19:**

+ 修复远程语音不识别问题;
+ 修复asr时有不灵问题;
+ 去除唱歌指令。

**2023.07.14:**

+ 修复linux及mac运行出错问题;
+ 修复因唇型出错无法继续执行问题;
+ 提供rwkv对接方案。

**2023.07.12:**

+ 修复助理版文字输入不读取人设回复问题;
@@ -152,39 +153,31 @@ Fay(服务端)与数字人的通讯接口: [`ws://127.0.0.1:10002`](ws://127

+ 修复无法运行唇型算法而导致的不播放声音问题。

**2023.06.28:**
**2023.06:**

+ 重构NLP模块管理逻辑,便于自由扩展;
+ gpt:拆分为ChatGPT及GPT、更换新的GPT接口、可单独配置代理服务器;
+ 指定yolov8包版本,解决yolo不兼容问题;
+ 修复:自言自语bug、接收多个待处理消息bug。

**2023.06.21:**

+ 集成灵聚NLP api(支持GPT3.5及多应用);
+ ui修正。

**2023.06.17:**

+ 集成本地唇型算法。

**2023.06.14:**

+ 解决多声道麦克风兼容问题;
+ 重构fay_core.py及fay_booter.py代码;
+ ui适应布局调整;
+ 恢复声音选择;
+ “思考中...”显示逻辑修复。

**2023.05.27:**
**2023.05:**

+ 修复多个bug:消息框换行及空格问题、语音识别优化;
+ 彩蛋转正,Fay沟通与ChatGPT并行;
+ 加入yolov8姿态识别;
+ 加入VisualGLM-6B多模态单机离线大语言模型。

**2023.05.12:**

+ 打出Fay数字人助理版作为主分支(带货版移到分支[`fay-sales-edition`](https://github.com/TheRamU/Fay/tree/fay-sales-edition));
+ 添加Fay助理的文字沟通窗口(文字与语音同步);
+ 添加沟通记录本地保存功能;
@@ -205,7 +198,7 @@ pip install -r requirements.txt
```

### **配置应用密钥**
+ 查看 [AI 模块](#ai-模块)
+ 查看 [API 模块](#ai-模块)
+ 浏览链接,注册并创建应用,将应用密钥填入 `./system.conf` 中

### **启动**
@@ -215,21 +208,18 @@ python main.py
```

### **AI 模块**
### **API 模块**
启动前需填入应用密钥

| 代码模块 | 描述 | 链接 |
| ------------------------- | -------------------------- | ------------------------------------------------------------ |
| ./ai_module/ali_nls.py | 实时语音识别(非必须,免费3个月,asr二选一) | https://ai.aliyun.com/nls/trans |
| ./ai_module/funasr.py | 达摩院开源免费本地asr (非必须,asr二选一) | fay/test/funasr/README.MD |
| ./ai_module/ms_tts_sdk.py | 微软 文本转情绪语音(非必须,不配置时使用免费的edge-tts) | https://azure.microsoft.com/zh-cn/services/cognitive-services/text-to-speech/ |
| ./ai_module/xf_ltp.py | 讯飞 情感分析 | https://www.xfyun.cn/service/emotion-analysis |
| ./ai_module/ali_nls.py | 实时语音识别(可选) | https://ai.aliyun.com/nls/trans |
| ./ai_module/ms_tts_sdk.py | 微软 文本转情绪语音(可选) | https://azure.microsoft.com/zh-cn/services/cognitive-services/text-to-speech/ |
| ./ai_module/xf_ltp.py | 讯飞 情感分析(可选) | https://www.xfyun.cn/service/emotion-analysis |
| ./utils/ngrok_util.py | ngrok.cc 外网穿透(可选) | http://ngrok.cc |
| ./ai_module/nlp_lingju.py | 灵聚NLP api(支持GPT3.5及多应用)(NLP多选1) | https://open.lingju.ai 需联系客服开通gpt3.5权限|
| ./ai_module/yuan_1_0.py | 浪潮源大模型(NLP 多选1) | https://air.inspur.com/ |
| ./ai_module/chatgpt.py | ChatGPT(NLP多选1) | ******* |
| ./ai_module/nlp_rasa.py | ChatGLM-6B的基础上前置Rasa会话管理(NLP 多选1) | https://m.bilibili.com/video/BV1D14y1f7pr |
| ./ai_module/nlp_VisualGLM.py | 对接VisualGLM-6B多模态单机离线大语言模型(NLP 多选1) | B站视频 |
| ./ai_module/nlp_lingju.py | 灵聚NLP api(支持GPT3.5及多应用)(可选) | https://open.lingju.ai 需联系客服开通gpt3.5权限|
| ./ai_module/yuan_1_0.py | 浪潮源大模型(可选) | https://air.inspur.com/ |
| | | |

@@ -250,55 +240,12 @@ python main.py
| ------------------------- | -------------------------- | ------------------------------------------------------------ |
| 关闭、再见、你走吧 | 静音、闭嘴、我想静静 | 取消静音、你在哪呢、你可以说话了 |

| 播放歌曲(音乐库暂不可用) | 暂停播放 | 更多 |
| ------------------------- | -------------------------- | ------------------------------------------------------------ |
| 播放歌曲、播放音乐、唱首歌、放首歌、听音乐、你会唱歌吗 | 暂停播放、别唱了、我不想听了 | 没有了... |

### **人设**
数字人属性,与用户交互中能做出相应的响应。
#### 交互灵敏度
在交互中,数字人能感受用户的情感,并作出反应。最直接的体现,就是语气的变化,如 开心/伤心/生气 等。
设置灵敏度,可改变用户情感对于数字人的影响程度。

### **接收来源**

#### 文本输入

通过沟通窗口与助理文本沟通

#### 麦克风

选择麦克风设备,实现面对面交互,成为你的伙伴

#### socket远程音频输入

可以接入远程音频输入,远程音频输出

### **联系**

### 相关文章:
1、集成消费级pc大模型(ChatGLM-6B的基础上前置Rasa会话管理):https://m.bilibili.com/video/BV1D14y1f7pr
**商务QQ: 467665317**

2、[非常全面的数字人解决方案_郭泽斌之心的博客-CSDN博客_数字人算法](https://blog.csdn.net/aa84758481/article/details/124758727)

3、【开源项目:数字人FAY——Fay新架构使用讲解】 https://www.bilibili.com/video/BV1NM411B7Ab/?share_source=copy_web&vd_source=64cd9062f5046acba398177b62bea9ad

4、【开源项目FAY——UE工程讲解】https://www.bilibili.com/video/BV1C8411P7Ac?vd_source=64cd9062f5046acba398177b62bea9ad

5、m1机器安装办法(Gason提供):https://www.zhihu.com/question/437075754

6、bilibili主页:[xszyou的个人空间_哔哩哔哩_bilibili](https://space.bilibili.com/2111554564)

商务联系QQ 467665317,我们提供:开发顾问、数字人模型定制及高校教学资源实施服务
http://yafrm.com/forum.php?mod=viewthread&tid=302

关注公众号(fay数字人)获取最新微信技术交流群二维码(**请先star本仓库**)
**交流群及资料教程**关注公众号 **fay数字人**(**请先star本仓库**)


238 README_EN.md (new file)
@@ -0,0 +1,238 @@
[`中文`](https://github.com/TheRamU/Fay/blob/main/README.md)

<div align="center">
<br>
<img src="images/icon.png" alt="Fay">
<h1>FAY</h1>
<h3>Fay Digital Human Assistant</h3>
</div>

Fay Digital Human Assistant Edition is an important branch of the Fay open-source project, focusing on building open-source solutions for intelligent digital assistants. It offers a flexible and modular design that allows developers to customize and combine various functional modules, including emotion analysis, NLP processing, speech synthesis, and speech output, among others. Fay Digital Assistant Edition provides developers with powerful tools and resources for building intelligent, personalized, and multifunctional digital assistant applications. With this edition, developers can easily create digital assistants applicable to various scenarios and domains, providing users with intelligent voice interactions and personalized services.

## Fay Digital Assistant Edition

ProTip: The shopping edition has been moved to a separate branch: [`fay-sales-edition`](https://github.com/TheRamU/Fay/tree/fay-sales-edition)



*Assistant Fay controller usage: voice communication with voice and text replies; text communication with text replies; to connect UE, Live2D, or xuniren, panel playback must be turned off.*

## **Assistant Fay controller**

Remote Android Local PC Remote PC

└─────────────┼─────────────┘

Aliyun API ─┐ │

├── ASR

[FunASR](https://www.bilibili.com/video/BV1qs4y1g74e) ─┘ │ ┌─ Yuan 1.0

│ ├─ [LingJu](https://www.bilibili.com/video/BV1NW4y1D76a/)

NLP ────┼─ [GPT/ChatGPT](https://www.bilibili.com/video/BV1Dg4y1V7pn)

│ ├─ [Rasa+ChatGLM-6B](https://www.bilibili.com/video/BV1D14y1f7pr)

Azure ─┐ │ ├─ [VisualGLM](https://www.bilibili.com/video/BV1mP411Q7mj)

Edge TTS ─┼── TTS └─ [RWKV](https://www.bilibili.com/video/BV1yu41157zB)

[Open source TTS](https://www.bilibili.com/read/cv25192534) ─┘ │

│

│

┌──────────┬────┼───────┬─────────┐

Remote Android [Live2D](https://www.bilibili.com/video/BV1sx4y1d775/?vd_source=564eede213b9ddfa9a10f12e5350fd64) [UE](https://www.bilibili.com/read/cv25133736) [xuniren](https://www.bilibili.com/read/cv24997550) Remote PC

*Important: Communication interface between Fay (server) and the digital human (client): [`ws://127.0.0.1:10002`](ws://127.0.0.1:10002) (connected)*

Message format: View [WebSocket.md](https://github.com/TheRamU/Fay/blob/main/WebSocket.md)
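As a rough sketch of how a client might consume these pushes, the dispatcher below keys on `Data.Key`; the key names (`audio`, `question`, `log`, `mood`) come from WebSocket.md and the controller code, while the handler bodies are placeholders rather than repository code.

```python
# Illustrative dispatcher for messages pushed by Fay over ws://127.0.0.1:10002.
# Key names follow WebSocket.md; the handlers below are placeholders.
import json


def handle_fay_message(raw: str) -> None:
    msg = json.loads(raw)
    data = msg.get("Data", {})
    key, value = data.get("Key"), data.get("Value")
    if key == "audio":
        # value is a path to a .wav file; Data.Text, Data.Lips and Data.Time accompany it
        print("play audio:", value, "subtitle:", data.get("Text"))
    elif key == "question":
        print("recognized user question:", value)
    elif key == "log":
        print("fay log:", value)
    elif key == "mood":
        print("mood:", value)
```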

|
||||
|
||||
|
||||
|
||||
**代码结构**
|
||||
|
||||
```
|
||||
.
|
||||
|
||||
├── main.py # Program main entry
|
||||
├── fay_booter.py # Core boot module
|
||||
├── config.json # Controller configuration file
|
||||
├── system.conf # System configuration file
|
||||
├── ai_module
|
||||
│ ├── ali_nls.py # Aliyun Real-time Voice
|
||||
│ ├── ms_tts_sdk.py # Microsoft Text-to-Speech
|
||||
│ ├── nlp_lingju.py # Lingju Human-Machine Interaction - Natural Language Processing
|
||||
│ ├── xf_aiui.py # Xunfei Human-Machine Interaction - Natural Language Processing
|
||||
│ ├── nlp_gpt.py # GPT API integration
|
||||
│ ├── nlp_chatgpt.py # Reverse integration with chat.openai.com
|
||||
│ ├── nlp_yuan.py # Langchao. Yuan model integration
|
||||
│ ├── nlp_rasa.py # Preceding Rasa conversation management based on ChatGLM-6B (highly recommended)
|
||||
│ ├── nlp_VisualGLM.py # Integration with multimodal large language model VisualGLM-6B
|
||||
│ ├── nlp_rwkv.py # Offline integration with rwkv
|
||||
│ ├── nlp_rwkv_api.py # rwkv server API
|
||||
│ ├── yolov8.py # YOLOv8 object detection
|
||||
│ └── xf_ltp.py # Xunfei Sentiment Analysis
|
||||
├── bin # Executable file directory
|
||||
├── core # Digital Human Core
|
||||
│ ├── fay_core.py # Digital Human Core module
|
||||
│ ├── recorder.py # Recorder
|
||||
│ ├── tts_voice.py # Speech synthesis enumeration
|
||||
│ ├── authorize_tb.py # fay.db authentication table management
|
||||
│ ├── content_db.py # fay.db content table management
|
||||
│ ├── interact.py # Interaction (message) object
|
||||
│ ├── song_player.py # Music player (currently unavailable)
|
||||
│ └── wsa_server.py # WebSocket server
|
||||
├── gui # Graphical interface
|
||||
│ ├── flask_server.py # Flask server
|
||||
│ ├── static
|
||||
│ ├── templates
|
||||
│ └── window.py # Window module
|
||||
├── scheduler
|
||||
│ └── thread_manager.py # Scheduler manager
|
||||
├── utils # Utility modules
|
||||
│ ├── config_util.py
|
||||
│ ├── storer.py
|
||||
│ └── util.py
|
||||
└── test # All surprises
|
||||
|
||||
```
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
## **Upgrade Log**

**2023.07.26:**

+ Fixed the problem of generating a large amount of WS information before connecting digital humans;
+ Added digital human (UE, Live2D, Xuniren) communication interface: real-time logs;
+ Updated digital human (UE, Live2D, Xuniren) communication interface: audio push.

**2023.07.21:**

+ Multiple updates for the merchandise version.

**2023.07.19:**
+ Fixed the issue of remote voice not being recognized.
+ Fixed the issue of occasional unresponsiveness during ASR (Automatic Speech Recognition).
+ Removed the singing command.

**2023.07.14:**

+ Fixed Linux and macOS runtime errors.
+ Fixed the issue of being unable to continue execution due to lip-sync errors.
+ Provided an integration solution for RWKV.

**2023.07.12:**

+ Fixed an issue in Assistant Edition where text input does not read persona responses.
+ Fixed an issue in Assistant Edition where text input does not read QA responses.
+ Enhanced microphone stability.

**2023.07.05:**

+ Fixed a sound playback issue caused by the inability to run the lip-sync algorithm.

**2023.06:**

+ Refactored NLP module management logic for easier extension.
+ Split GPT into ChatGPT and GPT, replaced with a new GPT interface, and added the ability to configure proxy servers separately.
+ Specified the version of the YOLOv8 package to resolve YOLO compatibility issues.
+ Fixed the self-talk bug and a bug in receiving multiple pending messages.
+ Integrated Lingju NLP API (supporting GPT3.5 and multiple applications).
+ UI corrections.
+ Integrated local lip-sync algorithm.
+ Resolved compatibility issues with multi-channel microphones.
+ Refactored fay_core.py and fay_booter.py code.
+ UI layout adjustments.
+ Restored sound selection.
+ Fixed the logic for displaying "Thinking...".

## **Installation Instructions**

### **Environment**
- Python 3.9, 3.10
- Windows, macOS, Linux

### **Installing Dependencies**

```shell
pip install -r requirements.txt
```

### **Configuring Application Key**
+ View [API Modules](#api-modules)
+ Browse the link, register, and create an application. Fill in the application key in `./system.conf`

### **Starting**

Start the Fay Controller:

```shell
python main.py
```

### **API Modules**

The application keys need to be filled in before starting.

| File | Description | Link |
|-----------------------------|----------------------------------------------------------|--------------------------------------------------------------|
| ./ai_module/ali_nls.py | Real-time Speech Recognition (*Optional*) | https://ai.aliyun.com/nls/trans |
| ./ai_module/ms_tts_sdk.py | Microsoft Text-to-Speech with Emotion (*Optional*) | https://azure.microsoft.com/zh-cn/services/cognitive-services/text-to-speech/ |
| ./ai_module/xf_ltp.py | Xunfei Sentiment Analysis (*Optional*) | https://www.xfyun.cn/service/emotion-analysis |
| ./utils/ngrok_util.py | ngrok.cc External Network Penetration (*Optional*) | http://ngrok.cc |
| ./ai_module/nlp_lingju.py | Lingju NLP API (supports GPT3.5 and multiple applications) (*Optional*) | https://open.lingju.ai Contact customer service to enable GPT3.5 access |
| ./ai_module/yuan_1_0.py | Langchao Yuan Model (*Optional*) | https://air.inspur.com/ |

## **Instructions for Use**

### **Instructions for Use**

+ Voice Assistant: Fay Controller (with microphone input source enabled and panel playback enabled).
+ Remote Voice Assistant: Fay Controller (with panel playback disabled) + Remote device integration.
+ Digital Human Interaction: Fay Controller (with microphone input source enabled, panel playback disabled, and personality Q&A filled) + Digital Human.
+ Jarvis, Her: Join us to complete the experience together.

### **Voice Commands**

| Shut down | Mute | Unmute |
| ------------------------- | -------------------------- | ------------------------------------------------------------ |
| Shut down, Goodbye, Go away | Mute, Be quiet, I want silence | Unmute, Where are you, You can speak now |

### **For business inquiries**

**Business QQ**: 467665317

38 WebSocket.md
@@ -36,6 +36,7 @@
    "Data": {
        "Key": "audio",
        "Value": "C:\samples\sample-1.wav",
        "Text": "很高兴见到你",
        "Lips": [{"Lip": "sil", "Time": 180}, {"Lip": "FF", "Time": 144}],
        "Time": 10,
        "Type": "interact"
@@ -51,6 +52,7 @@
| Data.Time | 音频时长 (秒) | float | |
| Data.Type | 发言类型 | str | interact/script |
| Data.Lips | 视音素 | array | |
| Data.text | 文本 | str | |

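A hedged sketch of how a client could use the new `Text` and `Lips` fields of the audio message: the unit of each lip entry's `Time` is assumed to be milliseconds here, and the "scheduling" merely prints each viseme, so treat it as an illustration rather than the project's player.

```python
# Illustrative consumer of the audio message above (not repository code).
import time


def on_audio_message(data: dict) -> None:
    print("wav file:", data["Value"])           # path to the synthesized audio
    print("subtitle:", data.get("Text", ""))    # new field: the spoken text
    for lip in data.get("Lips", []):            # new field: viseme timeline
        time.sleep(lip["Time"] / 1000.0)        # assumption: Time is in milliseconds
        print("viseme:", lip["Lip"])
```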
@@ -70,6 +72,42 @@

| 参数 | 描述 | 类型 | 范围 |
| ---------- | ---------------- | ----- | --------------- |
| Data.text | 文本 | str | |

### 发送询问文字

```json
{
    "Topic": "Unreal",
    "Data": {
        "Key": "question",
        "Value": "很高兴见到你"
    }
}
```

| 参数 | 描述 | 类型 | 范围 |
| ---------- | ---------------- | ----- | --------------- |
| Data.text | 文本 | str | |

### 发送日志文字

```json
{
    "Topic": "Unreal",
    "Data": {
        "Key": "log",
        "Value": "很高... "
    }
}
```

| 参数 | 描述 | 类型 | 范围 |
| ---------- | ---------------- | ----- | --------------- |
| Data.text | 文本 | str | |
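For reference, the controller builds this log message with the same three lines everywhere panel playback is disabled; a condensed excerpt of the pattern added throughout the Python modules below:

```python
# Pattern added in ali_nls.py, funasr.py, fay_core.py, recorder.py and util.py:
# push a log line to the digital human only when panel playback is turned off.
if not config_util.config["interact"]["playSound"]:  # 非展板播放
    content = {'Topic': 'Unreal', 'Data': {'Key': 'log', 'Value': text}}
    wsa_server.get_instance().add_cmd(content)
```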
@@ -83,10 +83,16 @@ class ALiNls:
                self.done = True
                self.finalResults = data['payload']['result']
                wsa_server.get_web_instance().add_cmd({"panelMsg": self.finalResults})
                if not cfg.config["interact"]["playSound"]:  # 非展板播放
                    content = {'Topic': 'Unreal', 'Data': {'Key': 'log', 'Value': self.finalResults}}
                    wsa_server.get_instance().add_cmd(content)
                self.__on_msg()
            elif name == 'TranscriptionResultChanged':
                self.finalResults = data['payload']['result']
                wsa_server.get_web_instance().add_cmd({"panelMsg": self.finalResults})
                if not cfg.config["interact"]["playSound"]:  # 非展板播放
                    content = {'Topic': 'Unreal', 'Data': {'Key': 'log', 'Value': self.finalResults}}
                    wsa_server.get_instance().add_cmd(content)
                self.__on_msg()

        except Exception as e:

@@ -35,6 +35,9 @@ class FunASR:
                self.done = True
                self.finalResults = message
                wsa_server.get_web_instance().add_cmd({"panelMsg": self.finalResults})
                if not cfg.config["interact"]["playSound"]:  # 非展板播放
                    content = {'Topic': 'Unreal', 'Data': {'Key': 'log', 'Value': self.finalResults}}
                    wsa_server.get_instance().add_cmd(content)
                self.__on_msg()

        except Exception as e:
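The same guard is repeated in every module touched by this commit; a hypothetical helper (the name `push_unreal_log` is mine, not in the repository) could centralize it:

```python
# Hypothetical helper, shown only to illustrate how the repeated
# "push a log to the digital human when panel playback is off" block
# could be factored out; it is not part of the repository.
from core import wsa_server
from utils import config_util


def push_unreal_log(text: str) -> None:
    if not config_util.config["interact"]["playSound"]:  # 非展板播放
        content = {'Topic': 'Unreal', 'Data': {'Key': 'log', 'Value': text}}
        wsa_server.get_instance().add_cmd(content)
```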
@@ -35,7 +35,7 @@ from ai_module import nlp_gpt
from ai_module import nlp_yuan
from ai_module import yolov8
from ai_module import nlp_VisualGLM

from ai_module import nlp_lingju

import platform
if platform.system() == "Windows":
@@ -43,7 +43,7 @@ if platform.system() == "Windows":
    sys.path.append("test/ovr_lipsync")
    from test_olipsync import LipSyncGenerator

from ai_module import nlp_lingju

modules = {
    "nlp_yuan": nlp_yuan,
@@ -157,6 +157,9 @@ class FeiFei:
            song_player.play()
        self.playing = False
        wsa_server.get_web_instance().add_cmd({"panelMsg": ""})
        if not cfg.config["interact"]["playSound"]:  # 非展板播放
            content = {'Topic': 'Unreal', 'Data': {'Key': 'log', 'Value': ""}}
            wsa_server.get_instance().add_cmd(content)

    #检查是否命中指令或q&a
    def __get_answer(self, interleaver, text):
@@ -164,9 +167,18 @@ class FeiFei:
        #指令
        keyword = qa_service.question('command',text)
        if keyword is not None:
            if keyword == "stop":
            if keyword == "playSong":
                MyThread(target=self.__play_song).start()
                wsa_server.get_web_instance().add_cmd({"panelMsg": ""})
                if not cfg.config["interact"]["playSound"]:  # 非展板播放
                    content = {'Topic': 'Unreal', 'Data': {'Key': 'log', 'Value': ""}}
                    wsa_server.get_instance().add_cmd(content)
            elif keyword == "stop":
                fay_booter.stop()
                wsa_server.get_web_instance().add_cmd({"panelMsg": ""})
                if not cfg.config["interact"]["playSound"]:  # 非展板播放
                    content = {'Topic': 'Unreal', 'Data': {'Key': 'log', 'Value': ""}}
                    wsa_server.get_instance().add_cmd(content)
                wsa_server.get_web_instance().add_cmd({"liveState": 0})
            elif keyword == "mute":
                self.muting = True
@@ -175,6 +187,9 @@ class FeiFei:
                MyThread(target=self.__say, args=['interact']).start()
                time.sleep(0.5)
                wsa_server.get_web_instance().add_cmd({"panelMsg": ""})
                if not cfg.config["interact"]["playSound"]:  # 非展板播放
                    content = {'Topic': 'Unreal', 'Data': {'Key': 'log', 'Value': ""}}
                    wsa_server.get_instance().add_cmd(content)
            elif keyword == "unmute":
                self.muting = False
                return None
@@ -186,6 +201,9 @@ class FeiFei:
                        break
                config_util.save_config(config_util.config)
                wsa_server.get_web_instance().add_cmd({"panelMsg": ""})
                if not cfg.config["interact"]["playSound"]:  # 非展板播放
                    content = {'Topic': 'Unreal', 'Data': {'Key': 'log', 'Value': ""}}
                    wsa_server.get_instance().add_cmd(content)
            return "NO_ANSWER"

        # 人设问答
@@ -217,21 +235,34 @@ class FeiFei:
            person_count, stand_count, sit_count = fay_eyes.get_counts()
            if person_count < 1:  #看不到人,不互动
                wsa_server.get_web_instance().add_cmd({"panelMsg": "看不到人,不互动"})
                if not cfg.config["interact"]["playSound"]:  # 非展板播放
                    content = {'Topic': 'Unreal', 'Data': {'Key': 'log', 'Value': "看不到人,不互动"}}
                    wsa_server.get_instance().add_cmd(content)
                continue

            answer = self.__get_answer(interact.interleaver, self.q_msg)  #确定是否命中指令或q&a
            if(self.muting):  #静音指令正在执行
                wsa_server.get_web_instance().add_cmd({"panelMsg": "静音指令正在执行,不互动"})
                if not cfg.config["interact"]["playSound"]:  # 非展板播放
                    content = {'Topic': 'Unreal', 'Data': {'Key': 'log', 'Value': "静音指令正在执行,不互动"}}
                    wsa_server.get_instance().add_cmd(content)
                continue

            contentdb = Content_Db()
            contentdb.add_content('member','speak',self.q_msg)
            wsa_server.get_web_instance().add_cmd({"panelReply": {"type":"member","content":self.q_msg}})
            if not config_util.config["interact"]["playSound"]:  # 非展板播放
                content = {'Topic': 'Unreal', 'Data': {'Key': 'question', 'Value': self.q_msg}}
                wsa_server.get_instance().add_cmd(content)

            text = ''
            textlist = []
            self.speaking = True
            if answer is None:
                wsa_server.get_web_instance().add_cmd({"panelMsg": "思考中..."})
                if not cfg.config["interact"]["playSound"]:  # 非展板播放
                    content = {'Topic': 'Unreal', 'Data': {'Key': 'log', 'Value': "思考中..."}}
                    wsa_server.get_instance().add_cmd(content)
                text,textlist = determine_nlp_strategy(1,self.q_msg)
            elif answer != 'NO_ANSWER':  #语音内容没有命中指令,回复q&a内容
                text = answer
@@ -245,6 +276,9 @@ class FeiFei:
                wsa_server.get_web_instance().add_cmd({"panelReply": {"type":"fay","content":textlist[i]['text']}})
                i+= 1
            wsa_server.get_web_instance().add_cmd({"panelMsg": self.a_msg})
            if not cfg.config["interact"]["playSound"]:  # 非展板播放
                content = {'Topic': 'Unreal', 'Data': {'Key': 'log', 'Value': self.a_msg}}
                wsa_server.get_instance().add_cmd(content)
            self.last_speak_data = self.a_msg
            MyThread(target=self.__say, args=['interact']).start()

@@ -272,7 +306,7 @@ class FeiFei:
    def __send_mood(self):
        while self.__running:
            time.sleep(3)
            if not self.sleep and not config_util.config["interact"]["playSound"]:
            if not self.sleep and not config_util.config["interact"]["playSound"] and wsa_server.get_instance().isConnect:
                content = {'Topic': 'Unreal', 'Data': {'Key': 'mood', 'Value': self.mood}}
                wsa_server.get_instance().add_cmd(content)

@@ -365,7 +399,7 @@ class FeiFei:
            self.__play_sound(file_url)
        else:  #发送音频给ue和socket
            #推送ue
            content = {'Topic': 'Unreal', 'Data': {'Key': 'audio', 'Value': os.path.abspath(file_url), 'Time': audio_length, 'Type': say_type}}
            content = {'Topic': 'Unreal', 'Data': {'Key': 'audio', 'Value': os.path.abspath(file_url), 'Text': self.a_msg, 'Time': audio_length, 'Type': say_type}}
            #计算lips
            if platform.system() == "Windows":
                try:
@@ -396,6 +430,9 @@ class FeiFei:

        time.sleep(audio_length + 0.5)
        wsa_server.get_web_instance().add_cmd({"panelMsg": ""})
        if not cfg.config["interact"]["playSound"]:  # 非展板播放
            content = {'Topic': 'Unreal', 'Data': {'Key': 'log', 'Value': ""}}
            wsa_server.get_instance().add_cmd(content)
        if config_util.config["interact"]["playSound"]:
            util.log(1, '结束播放!')
        self.speaking = False
@@ -441,6 +478,9 @@ class FeiFei:
        self.playing = False
        self.sp.close()
        wsa_server.get_web_instance().add_cmd({"panelMsg": ""})
        if not cfg.config["interact"]["playSound"]:  # 非展板播放
            content = {'Topic': 'Unreal', 'Data': {'Key': 'log', 'Value': ""}}
            wsa_server.get_instance().add_cmd(content)
        if self.deviceConnect is not None:
            self.deviceConnect.close()
            self.deviceConnect = None
@@ -92,6 +92,9 @@ class Recorder:
        self.processing = False
        self.dynamic_threshold = self.__get_history_percentage(30)
        wsa_server.get_web_instance().add_cmd({"panelMsg": ""})
        if not cfg.config["interact"]["playSound"]:  # 非展板播放
            content = {'Topic': 'Unreal', 'Data': {'Key': 'log', 'Value': ""}}
            wsa_server.get_instance().add_cmd(content)

    def __record(self):
BIN images/kzq.jpg
Binary file not shown. Before size: 418 KiB, after size: 406 KiB.
@@ -1,3 +1,6 @@
import pyaudio
audio = pyaudio.PyAudio()
print(audio.get_device_count())
for i in range(audio.get_device_count()):
    devInfo = audio.get_device_info_by_index(i)
    if devInfo['hostApi'] == 0:
        print(devInfo)
@@ -6,6 +6,7 @@ import time

from core import wsa_server
from scheduler.thread_manager import MyThread
from utils import config_util

LOGS_FILE_URL = "logs/log-" + time.strftime("%Y%m%d%H%M%S") + ".log"

@@ -33,6 +34,9 @@ def printInfo(level, sender, text, send_time=-1):
    print(logStr)
    if level >= 3:
        wsa_server.get_web_instance().add_cmd({"panelMsg": text})
        if not config_util.config["interact"]["playSound"]:  # 非展板播放
            content = {'Topic': 'Unreal', 'Data': {'Key': 'log', 'Value': text}}
            wsa_server.get_instance().add_cmd(content)
    MyThread(target=__write_to_file, args=[logStr]).start()