Accurate tracking of user interactions is a cornerstone of building more capable computer agents. Systems that are meant to mimic cognitive processes and carry out tasks independently depend on careful, systematic recording of how people actually use their machines, because those recordings become the data on which the agents are developed and trained.
The Duck AI researchers have developed DuckTrack to record various inputs accurately so that computer agents can be properly trained on the collected data. DuckTrack offers a synchronized collection of mouse, keyboard, screen video, and audio data through a user-friendly desktop app compatible with major operating systems.
Alongside the tool, the DuckTrack team has launched a Community Data Collection Initiative, an open-source effort that invites contributors to help collect diverse computer interaction data. DuckTrack is written in Python. Its feature overview highlights accurate recording and playback of mouse and keyboard actions, and the researchers said that integrating screen recording through OBS further enhances its versatility.
For DuckTrack, the Structural Similarity Index (SSIM) consistently exceeds 0.9 in drawing tasks. Each event is recorded with a low error margin of 0.03 ms ± 0.4 ms, which the researchers say surpasses the accuracy of existing trackers on the market. These performance metrics make DuckTrack a reliable choice for users seeking accurate tracking and playback.
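To make the two metrics quoted above more concrete, here is a minimal Python sketch of how such numbers could be computed. It is not DuckTrack's evaluation code. The function names `global_ssim` and `timing_error_ms` are hypothetical, the SSIM shown is the simplified single-window form (the usual image-quality version averages the same formula over local windows), and the timing figure is assumed to be a mean ± standard deviation of per-event timestamp differences.

```python
from statistics import mean, pstdev


def global_ssim(img_a, img_b, dynamic_range=255.0):
    """Single-window SSIM for two equal-length sequences of grayscale pixels.

    Simplified illustration: the whole image is treated as one window.
    """
    if len(img_a) != len(img_b) or len(img_a) == 0:
        raise ValueError("images must be non-empty and the same size")
    n = len(img_a)
    # Standard stabilising constants from the SSIM definition.
    c1 = (0.01 * dynamic_range) ** 2
    c2 = (0.03 * dynamic_range) ** 2
    mu_a = sum(img_a) / n
    mu_b = sum(img_b) / n
    var_a = sum((p - mu_a) * (p - mu_a) for p in img_a) / n
    var_b = sum((p - mu_b) * (p - mu_b) for p in img_b) / n
    cov = sum((pa - mu_a) * (pb - mu_b) for pa, pb in zip(img_a, img_b)) / n
    numerator = (2 * mu_a * mu_b + c1) * (2 * cov + c2)
    denominator = (mu_a * mu_a + mu_b * mu_b + c1) * (var_a + var_b + c2)
    return numerator / denominator


def timing_error_ms(recorded_s, replayed_s):
    """Mean and population std. dev. (in ms) of replayed minus recorded times.

    Both inputs are per-event timestamps in seconds, in the same order.
    """
    if len(recorded_s) != len(replayed_s) or len(recorded_s) == 0:
        raise ValueError("timestamp lists must be non-empty and the same length")
    errors = [(rep - rec) * 1000.0 for rec, rep in zip(recorded_s, replayed_s)]
    return mean(errors), pstdev(errors)
```

In a drawing task, `img_a` and `img_b` would be the flattened grayscale pixels of the canvas after the original session and after playback. A score near 1 means the replayed drawing closely matches the original.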
DuckTrack has limitations, too. Realistically mimicking double or triple clicks during playback remains a challenge, which reduces the accuracy of these actions. DuckTrack also cannot record trackpad gestures, and it has trouble capturing input in applications that read raw input, such as games. The developers are working to address these limitations and to improve DuckTrack’s capabilities with continued help from the community.
The researchers tested DuckTrack on several systems, including an M2 Pro MBP 14 running macOS Sonoma 14.0 and an Intel i7-10510U System76 Lemur Pro 9 running PopOS! 22.10 (Ubuntu-based) and Windows 10 22H2. They also tested it on the ReCAPTCHA task, used as a proxy for human-like movement, and reported a 100% success rate across ten trials. While hardware variations may slightly influence performance, the consistent accuracy across operating systems underscores DuckTrack’s reliability.
The researchers will announce the detailed guidelines on contributing and setting up data collection shortly, encouraging a collective effort to refine and evolve DuckTrack’s functionality.
DuckTrack is a significant step forward in gathering data on computer interactions. With its focus on precision, ongoing community involvement, and continued improvement, it is a strong choice for people and businesses that need accurate tracking and playback. As it develops, DuckTrack is paving the way for more sophisticated and seamless multimodal computer interaction.
Download the pre-built application for your system here.
Check out the Blog and GitHub. All credit for this research goes to the researchers of this project.
Rachit Ranjan is a consulting intern at MarktechPost. He is currently pursuing his B.Tech at the Indian Institute of Technology (IIT) Patna. He is actively shaping his career in Artificial Intelligence and Data Science and is passionate about exploring these fields.