Cross-Platform Desk
PoixsDesk is a WebRTC-based remote desktop (cloud computer) system that lets users access and control Windows desktops from a web browser. It provides low-latency screen sharing, audio/video transmission, and input control.
┌─────────────────────────────────────────────────────────────────┐
│ PoixsDesk System Architecture │
└─────────────────────────────────────────────────────────────────┘
┌──────────────────┐ ┌──────────────────┐
│ Web Client │ │ Desktop App │
│ │ │ │
│ - Browser Player│ ◄──────►│ - Screen Capture│
│ - Input Events │ │ - A/V Encoding │
│ - Signaling │ │ - Input Control │
└──────────────────┘ └──────────────────┘
│ │
│ │
▼ ▼
┌──────────────────────────────────────────────┐
│ Signaling Server (WebSocket) │
│ │
│ - SDP Exchange │
│ - ICE Candidate Exchange │
│ - Room Management │
└──────────────────────────────────────────────┘
│ │
│ │
▼ ▼
┌──────────────────┐ ┌──────────────────┐
│ WebRTC Media │ │ WebRTC Media │
│ Transport │ ◄──────►│ Transport │
│ (RTP/SRTP) │ │ (RTP/SRTP) │
│ │ │ │
│ - Video Stream │ │ - Video Stream │
│ - Audio Stream │ │ - Audio Stream │
│ - Data Channel │ │ - Data Channel │
└──────────────────┘ └──────────────────┘
┌─────────────────────────────────────────────────────────────┐
│ PoixsDesk Module Architecture │
└─────────────────────────────────────────────────────────────┘
┌───────────────────────────────────────────────────────────┐
│ PoixsDesk (Main Program) │
│ - main.cpp: Program entry point │
│ - Session monitoring and lifecycle management │
└───────────────────────────────────────────────────────────┘
│
┌──────────────────┼──────────────────┐
│ │ │
▼ ▼ ▼
┌───────────────┐ ┌───────────────┐ ┌───────────────┐
│ libcommon │ │ libdevice │ │ Win │
│ (Common Lib) │ │ (Device Ctrl)│ │ (GUI App) │
└───────────────┘ └───────────────┘ └───────────────┘
│ │ │
│ │ │
┌────┴────┐ ┌────┴────┐ ┌────┴────┐
│ │ │ │ │ │
┌──▼──┐ ┌──▼──┐ ┌──▼──┐ ┌──▼──┐ ┌──▼──┐ ┌──▼──┐
│Client│ │RTC │ │Input│ │Desk │ │Window│ │Dialog│
│ │ │Pub │ │Dev │ │Ctrl │ │ UI │ │ UI │
└─────┘ └─────┘ └─────┘ └─────┘ └─────┘ └─────┘
Responsibilities:
Class Structure:
class crtc_client : public crtc_publisher::clistener
{
// State management
ERtc_Type m_status;
bool m_stoped;
// RTC publisher
std::unique_ptr<crtc_publisher> m_rtc_publisher;
// Signaling related
std::string m_room_name;
std::string m_user_name;
// Methods
void init(int gpu_index);
void Loop(const std::string& rtc_url);
void destroy();
};
State Machine:
ERtc_None → ERtc_WebSocket_Init → ERtc_WebSocket →
ERtc_WebSocket_Wait → ERtc_WebSocket_Close → ERtc_Destory → ERtc_Exit
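The progression above can be written down as a small transition function. The sketch below is only an illustration: the enumerator names come from the state list, but the `next_state` and `steps_to_exit` helpers, and the assumption that every state has exactly one successor, are hypothetical. The real `crtc_client::Loop` may branch, for example to reconnect after a close.

```cpp
// Enumerators mirror the state list in this document (spelling kept as-is).
enum ERtc_Type {
    ERtc_None,
    ERtc_WebSocket_Init,
    ERtc_WebSocket,
    ERtc_WebSocket_Wait,
    ERtc_WebSocket_Close,
    ERtc_Destory,
    ERtc_Exit
};

// Hypothetical helper: returns the successor of a state in the linear chain.
// ERtc_Exit is terminal and maps to itself.
inline ERtc_Type next_state(ERtc_Type s) {
    switch (s) {
        case ERtc_None:            return ERtc_WebSocket_Init;
        case ERtc_WebSocket_Init:  return ERtc_WebSocket;
        case ERtc_WebSocket:       return ERtc_WebSocket_Wait;
        case ERtc_WebSocket_Wait:  return ERtc_WebSocket_Close;
        case ERtc_WebSocket_Close: return ERtc_Destory;
        case ERtc_Destory:         return ERtc_Exit;
        case ERtc_Exit:            return ERtc_Exit;
    }
    return ERtc_Exit;  // unreachable for valid enumerators
}

// Hypothetical helper: counts the transitions needed to reach ERtc_Exit.
inline int steps_to_exit(ERtc_Type s) {
    int steps = 0;
    while (s != ERtc_Exit) {
        s = next_state(s);
        ++steps;
    }
    return steps;
}
```

Keeping the transitions in one function makes it easy to see that every state eventually reaches `ERtc_Exit`.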
Responsibilities:
Class Structure:
class crtc_publisher : public PeerConnectionObserver,
public CreateSessionDescriptionObserver
{
// PeerConnection
rtc::scoped_refptr<PeerConnectionInterface> peer_connection_;
// Audio/Video source
rtc::scoped_refptr<VideoTrackSourceInterface> video_track_source_;
// Data channel
std::unique_ptr<cdata_channel> m_data_channel_ptr;
// Callback interface
clistener* m_callback_ptr;
// Methods
void create_offer();
void set_remoter_description(const std::string& sdp);
void InitializePeerConnection();
};
Responsibilities:
Class Structure:
class cinput_device
{
// Data channel
rtc::scoped_refptr<DataChannelInterface> dataChannel;
// Message queue
std::list<rtc::CopyOnWriteBuffer> m_messages;
std::mutex m_messages_lock;
// Worker thread
std::thread m_work_thread;
// Methods
void OnMessage(const webrtc::DataBuffer& buffer);
void OnMouseMove(...);
void OnKeyDown(...);
// ... other input event handlers
};
Input Event Flow:
Remote Event → DataChannel Receive → Message Queue → Worker Thread Process →
Coordinate Transform → Windows API → Local Input
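The "DataChannel Receive → Message Queue → Worker Thread Process" part of this flow is a producer/consumer hand-off. A minimal sketch of that pattern follows, assuming a condition variable is used to wake the worker. The class name `input_queue_sketch` is hypothetical, and `std::string` payloads stand in for `rtc::CopyOnWriteBuffer` so the example compiles without WebRTC.

```cpp
#include <condition_variable>
#include <functional>
#include <list>
#include <mutex>
#include <string>
#include <thread>
#include <utility>

class input_queue_sketch {
public:
    explicit input_queue_sketch(std::function<void(const std::string&)> handler)
        : m_handler(std::move(handler)),
          m_work_thread([this] { work(); }) {}

    ~input_queue_sketch() { stop(); }

    // Called from the receive path (OnMessage in the real class).
    void push(std::string message) {
        {
            std::lock_guard<std::mutex> lock(m_messages_lock);
            m_messages.push_back(std::move(message));
        }
        m_cond.notify_one();
    }

    // Lets the worker drain queued messages, then joins it. Safe to call twice.
    void stop() {
        {
            std::lock_guard<std::mutex> lock(m_messages_lock);
            m_stopped = true;
        }
        m_cond.notify_one();
        if (m_work_thread.joinable()) m_work_thread.join();
    }

private:
    void work() {
        for (;;) {
            std::string message;
            {
                std::unique_lock<std::mutex> lock(m_messages_lock);
                m_cond.wait(lock, [this] { return m_stopped || !m_messages.empty(); });
                if (m_messages.empty()) return;  // stopped and fully drained
                message = std::move(m_messages.front());
                m_messages.pop_front();
            }
            m_handler(message);  // handle outside the lock so push() is never blocked
        }
    }

    std::function<void(const std::string&)> m_handler;
    std::list<std::string> m_messages;
    std::mutex m_messages_lock;
    std::condition_variable m_cond;
    bool m_stopped = false;
    std::thread m_work_thread;  // declared last so it starts after the other members exist
};
```

Running the handler outside the lock keeps the receive callback short, which matters because data channel callbacks typically run on a WebRTC-owned thread.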
Responsibilities:
Responsibilities:
Main Functions:
// Mouse control
void abs_mouse(FEvent& input, float x, float y); // Absolute mouse move
void move_mouse(FEvent& input, int deltaX, int deltaY); // Relative mouse move
void button_mouse(FEvent& input, int32_t posX, int32_t posY, int button, bool release); // Mouse button
void scroll(FEvent& input, int distance); // Vertical scroll
void hscroll(FEvent& input, int distance); // Horizontal scroll
// Keyboard control
void keyboard_update(FEvent& input, uint16_t modcode, bool release, uint8_t flags);
// Input sending
void send_input(INPUT& i);
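For absolute moves such as `abs_mouse`, `SendInput` with `MOUSEEVENTF_ABSOLUTE` expects coordinates normalized to the range 0 to 65535 rather than pixels. A minimal, platform-independent sketch of that conversion, using the formula `pixel_x * (65535.0 / screen_width)`, might look as follows. The function name `to_virtual`, the rounding, and the clamping are assumptions for illustration, not taken from `device.cpp`.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstdint>

// Hypothetical helper: maps a pixel coordinate to the 0..65535 virtual range.
// screen_extent is the screen width (for x) or height (for y) in pixels.
inline std::int32_t to_virtual(double pixel, int screen_extent) {
    assert(screen_extent > 0);
    const double scaled = pixel * (65535.0 / screen_extent);
    // Clamp so out-of-range input cannot produce an invalid coordinate.
    const double clamped = std::min(65535.0, std::max(0.0, scaled));
    return static_cast<std::int32_t>(std::lround(clamped));
}
```

The same function serves both axes: pass the screen width for x and the screen height for y.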
Coordinate Mapping:
virtual_x = pixel_x * (65535.0 / screen_width)
Responsibilities:
Key Features:
Responsibilities:
┌──────────────┐ ┌──────────────┐ ┌──────────────┐
│Desktop Capture│─────►│Video Encoding│─────►│WebRTC Transport│
│ │ │ │ │ │
│ DXGI/WGC │ │ H.264/VP9 │ │ RTP/SRTP │
└──────────────┘ └──────────────┘ └──────────────┘
│
▼
┌──────────────┐
│ Web Client │
│ Video Player │
└──────────────┘
┌──────────────┐ ┌──────────────┐ ┌──────────────┐
│Audio Capture │─────►│Audio Encoding│─────►│WebRTC Transport│
│ │ │ │ │ │
│ Windows Audio│ │ Opus/G.711 │ │ RTP/SRTP │
└──────────────┘ └──────────────┘ └──────────────┘
┌──────────────┐ ┌──────────────┐ ┌──────────────┐
│ Web Client │─────►│Signaling/Data│─────►│Input Device │
│ │ │ Channel │ │ Handler │
│Mouse/Keyboard│ │ JSON/Binary │ │Coord Transform│
└──────────────┘ └──────────────┘ └──────────────┘
│
▼
┌──────────────┐
│ Windows API │
│ SendInput │
└──────────────┘
┌──────────────┐ ┌──────────────┐ ┌──────────────┐
│ Desktop │ │Signaling Server│ │ Web Client │
│ │ │ │ │ │
│ Offer SDP │─────►│ │◄─────│Signaling Conn│
│ │ │ Signaling │ │ │
│ICE Candidate │─────►│ Forward │◄─────│ Answer SDP │
│ │ │ │ │ │
│ Answer SDP │◄─────│ │─────►│ICE Candidate │
└──────────────┘ └──────────────┘ └──────────────┘
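The "Signaling Forward" role in the diagram above amounts to relaying each message to the other participants of the same room. The sketch below illustrates that idea with an in-memory relay. The class `room_relay_sketch` and its methods are hypothetical, and the callbacks stand in for WebSocket connections; the actual signaling server is not described in this document.

```cpp
#include <functional>
#include <map>
#include <string>
#include <utility>

// Hypothetical in-memory relay: each room name maps to its connected users.
class room_relay_sketch {
public:
    using send_fn = std::function<void(const std::string& message)>;

    void join(const std::string& room, const std::string& user, send_fn send) {
        m_rooms[room][user] = std::move(send);
    }

    void leave(const std::string& room, const std::string& user) {
        auto it = m_rooms.find(room);
        if (it == m_rooms.end()) return;
        it->second.erase(user);
        if (it->second.empty()) m_rooms.erase(it);  // drop empty rooms
    }

    // Forwards a message to every other user in the room.
    // Returns the number of recipients.
    int forward(const std::string& room, const std::string& from,
                const std::string& message) const {
        auto it = m_rooms.find(room);
        if (it == m_rooms.end()) return 0;
        int delivered = 0;
        for (const auto& entry : it->second) {
            if (entry.first == from) continue;  // never echo back to the sender
            entry.second(message);
            ++delivered;
        }
        return delivered;
    }

private:
    std::map<std::string, std::map<std::string, send_fn>> m_rooms;
};
```

Because the relay treats the payload as an opaque string, the offer, answer, and ICE candidate messages all travel through the same `forward` path.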
{
"type": "offer",
"room_name": "room123",
"user_name": "user1",
"sdp": "v=0\r\no=- ..."
}
{
"type": "answer",
"room_name": "room123",
"user_name": "user2",
"sdp": "v=0\r\no=- ..."
}
{
"type": "ice-candidate",
"room_name": "room123",
"user_name": "user1",
"candidate": {
"candidate": "candidate:...",
"sdpMLineIndex": 0,
"sdpMid": "0"
}
}
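SDP text contains carriage returns and line feeds, so it has to be escaped before it is embedded in a message like the ones above. The following sketch builds an offer or answer message by hand. The helpers `json_escape` and `make_sdp_message` are hypothetical, and a real implementation would more likely use a JSON library; which one the project uses is not stated here.

```cpp
#include <cstdio>
#include <string>

// Escapes the characters that must not appear raw inside a JSON string.
inline std::string json_escape(const std::string& in) {
    std::string out;
    out.reserve(in.size());
    for (char c : in) {
        switch (c) {
            case '"':  out += "\\\""; break;
            case '\\': out += "\\\\"; break;
            case '\r': out += "\\r";  break;
            case '\n': out += "\\n";  break;
            case '\t': out += "\\t";  break;
            default:
                if (static_cast<unsigned char>(c) < 0x20) {
                    // Remaining control characters use the \u00XX form.
                    char buf[7];
                    std::snprintf(buf, sizeof(buf), "\\u%04x",
                                  static_cast<unsigned>(static_cast<unsigned char>(c)));
                    out += buf;
                } else {
                    out += c;
                }
                break;
        }
    }
    return out;
}

// Hypothetical helper: builds an "offer" or "answer" message with the
// fields shown in the examples above.
inline std::string make_sdp_message(const std::string& type,
                                    const std::string& room_name,
                                    const std::string& user_name,
                                    const std::string& sdp) {
    return "{\"type\":\"" + json_escape(type) +
           "\",\"room_name\":\"" + json_escape(room_name) +
           "\",\"user_name\":\"" + json_escape(user_name) +
           "\",\"sdp\":\"" + json_escape(sdp) + "\"}";
}
```

For example, `make_sdp_message("offer", "room123", "user1", sdp)` yields a single-line JSON object whose `sdp` field carries the line breaks as `\r\n` escape sequences.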
┌──────────┬──────────┬──────────┬──────────┐
│Msg Type │Key Flag │Coord X │Coord Y │
│(1 byte) │(1 byte) │(2 bytes) │(2 bytes) │
└──────────┴──────────┴──────────┴──────────┘
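A decoder for the 6-byte layout above could be sketched as follows. The document does not state the byte order or signedness of the coordinate fields, so the little-endian, unsigned interpretation here is an assumption; the names `input_message_sketch`, `read_u16_le`, and `parse_input_message` are hypothetical.

```cpp
#include <cstddef>
#include <cstdint>

// Field names follow the layout table; the struct itself is hypothetical.
struct input_message_sketch {
    std::uint8_t  msg_type;
    std::uint8_t  key_flag;
    std::uint16_t x;
    std::uint16_t y;
};

// Assumption: 16-bit fields are little-endian.
inline std::uint16_t read_u16_le(const std::uint8_t* p) {
    return static_cast<std::uint16_t>(p[0] | (p[1] << 8));
}

// Returns false if the buffer is too short to hold one message.
inline bool parse_input_message(const std::uint8_t* data, std::size_t size,
                                input_message_sketch& out) {
    constexpr std::size_t kMessageSize = 6;  // 1 + 1 + 2 + 2 bytes
    if (data == nullptr || size < kMessageSize) return false;
    out.msg_type = data[0];
    out.key_flag = data[1];
    out.x = read_u16_le(data + 2);
    out.y = read_u16_le(data + 4);
    return true;
}
```

Reading the fields byte by byte, rather than casting the buffer to a struct, avoids alignment and padding problems and keeps the byte-order assumption in one place.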
{
"type": "mouse_move",
"x": 100,
"y": 200,
"deltaX": 10,
"deltaY": 20
}
pixel = normalized * screen_dimension
syncThreadDesktop() to synchronize the thread with the input desktop
VK_TO_SCANCODE_MAP to look up the scan code directly
MapVirtualKey() for dynamic conversion
PoixsDesk/
├── CMakeLists.txt # Main build file
├── README.md # Project readme
├── ARCHITECTURE_EN.md # Architecture documentation (this document)
├── libcommon/ # Common library
│ ├── client.h/cpp # RTC client
│ ├── crtc_publisher.h/cpp # RTC publisher
│ ├── cinput_device.h/cpp # Input device
│ ├── cdesktop_capture.h/cpp # Desktop capture
│ ├── cdata_channel.h/cpp # Data channel
│ └── ...
├── libdevice/ # Device control library
│ └── window/
│ ├── device.h/cpp # Input device control
│ └── misc.h/cpp # Helper functions
├── PoixsDesk/ # Main program
│ └── main.cpp # Program entry point
├── Win/ # Windows GUI program
│ ├── LiveWin32.cpp # Main window
│ └── Dlg*.cpp # Dialogs
├── Tools/ # Tool programs
│ ├── PoisxDeskService.cpp # Windows service
│ └── audio.cpp # Audio processing
├── www/ # Web frontend
│ ├── index.html # Home page
│ ├── player.html # Player page
│ └── scripts/ # JavaScript scripts
└── build/ # Build directory
# 1. Create build directory
mkdir build
cd build
# 2. Configure CMake (need to specify WebRTC path)
cmake .. -DWebRTC_ROOT=/path/to/webrtc
# 3. Build
cmake --build . --config Release
# 4. Run
./PoixsDesk/PoixsDesk.exe ws://localhost:8080/rtc
cdesktop_capture module
VideoEncoderInterface
crtc_client
Issue 1: Input Events Invalid
Issue 2: Video Not Displaying
Issue 3: High Latency
Document Version: 1.0
Last Updated: 2025-01-XX
Author: chensong