IOM Releases Its Second Synthetic Dataset From Trafficking Victim Case Records Generated With Differential Privacy And AI From Microsoft

Researchers at Microsoft are committed to exploring how technology can help the world’s most marginalized people improve their human rights situations. Their expertise spans human-computer interaction, data science, and the social sciences. The research team collaborates with community, governmental, and nongovernmental organizations to develop publicly available technologies that enable scalable responses to such issues.

The International Organization for Migration (IOM) is a United Nations agency that helps migrants and survivors of human trafficking. By offering assistance to governments and migrants in its 175 member states, IOM strives to promote humane and orderly migration.

IOM has released its second synthetic dataset, derived from case records of victims of trafficking, using software built by Microsoft researchers. It is the first public dataset to describe victim-perpetrator relations. To facilitate data sharing and rigorous research while respecting privacy and civil liberties, it is also the first synthetic dataset to be generated with differential privacy, which provides an additional security guarantee across repeated data releases. The release is the result of years of collaboration between Microsoft and IOM, and it promotes the secure sharing of victim case data in ways that can inform collective action within the anti-trafficking community. The collaboration grew out of a shared commitment to improving the security and usefulness of the Counter-Trafficking Data Collaborative (CTDC) data hub, the first global portal for human trafficking case data. Since then, IOM and Microsoft have worked together to make better use of data on victims and survivors, including their descriptions of traffickers, in the fight against human trafficking.

This work has resulted in a new user interface offered as a public utility web application, allowing users to aggregate and synthesize private data without sending any of it outside the user’s local web browser.

Importance of data privacy while working with vulnerable populations 

All precautions must be taken to prevent traffickers from identifying victims of trafficking in published datasets. Victims’ personal information must be kept confidential to avoid further traumatization or social exclusion. At the same time, a privacy method that over- or under-reports a trend in victim cases could mislead decision-makers into misallocating limited resources, preventing them from addressing the underlying problem.

IOM and Microsoft’s collaboration was founded on the idea that, rather than redacting sensitive data to achieve privacy, it could be possible to produce synthetic datasets that accurately capture the structure and statistics of the underlying sensitive data while remaining private by design. In light of this guiding principle and the need to provide case count breakdowns by various attribute combinations (e.g., age range, gender, nationality), the team developed an approach in which a synthetic dataset is released alongside privacy-preserving counts of cases matching all short combinations of case attributes. The aggregate data is therefore useful both for assessing the quality of the synthetic data and for recovering accurate counts for official reporting.
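The idea of privacy-preserving counts over short attribute combinations can be illustrated with a minimal sketch. It assumes the textbook Laplace mechanism, a publicly known set of possible values per attribute, and add-or-remove-one-record neighboring datasets; the article does not say which noise mechanism the Microsoft software uses, so this is an illustration and not the project's implementation. All function names, attribute names, and records are hypothetical.

```python
import itertools
import math
import random
from collections import Counter


def combination_counts(records, attributes, max_len):
    """Exact number of records matching each attribute-value
    combination of length 1..max_len."""
    counts = Counter()
    for record in records:
        for size in range(1, max_len + 1):
            for attrs in itertools.combinations(attributes, size):
                counts[tuple((a, record[a]) for a in attrs)] += 1
    return counts


def laplace_noise(scale, rng):
    """The difference of two exponential draws is Laplace(0, scale)."""
    return rng.expovariate(1 / scale) - rng.expovariate(1 / scale)


def noisy_combination_counts(records, domain, max_len, epsilon, rng):
    """Laplace-noised counts for every combination in the public domain.

    Assumption: `domain` (attribute -> list of possible values) is public,
    so zero-count cells are noised too and do not reveal absence.
    """
    attributes = sorted(domain)
    exact = combination_counts(records, attributes, max_len)
    # One record adds 1 to exactly one cell per attribute subset, so the
    # L1 sensitivity is the number of subsets of size 1..max_len.
    sensitivity = sum(
        math.comb(len(attributes), s) for s in range(1, max_len + 1)
    )
    scale = sensitivity / epsilon
    noisy = {}
    for size in range(1, max_len + 1):
        for attrs in itertools.combinations(attributes, size):
            for values in itertools.product(*(domain[a] for a in attrs)):
                key = tuple(zip(attrs, values))
                value = exact[key] + laplace_noise(scale, rng)
                # Rounding and clamping only post-process the noisy value.
                noisy[key] = max(0, round(value))
    return noisy


# Hypothetical toy data, not drawn from any real case records.
DOMAIN = {"age": ["18-20", "21-23"], "gender": ["F", "M"]}
RECORDS = [
    {"age": "18-20", "gender": "F"},
    {"age": "18-20", "gender": "F"},
    {"age": "21-23", "gender": "M"},
]
released = noisy_combination_counts(
    RECORDS, DOMAIN, max_len=2, epsilon=1.0, rng=random.Random(7)
)
```

A smaller `epsilon` means more noise and stronger privacy. With only three toy records the noise dominates the counts, which is one reason approaches like this are applied to large case collections and short combinations.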

Because differentially private data has the property that further processing cannot increase privacy loss, any dataset derived from these aggregates retains the same level of privacy. This allowed the team to adapt their existing method of data synthesis, which builds synthetic records by sampling combinations of attributes until all attributes are covered, to extrapolate the noisy reported attribute combinations into complete, differentially private synthetic records. The result is accurate aggregate data for official reporting, synthetic data for interactive exploration and machine learning, and differential privacy guarantees that provide protection even over multiple overlapping data releases. All of these are essential for IOM and similar organizations to build a strong data ecosystem against human trafficking and other human rights violations.
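The synthesis step described above, building records by sampling attribute values until every attribute is covered, might look roughly like the following. This is a simplified sketch and not the team's algorithm: it assumes noisy counts are available for single attributes and pairs, and it weights each candidate value by the smallest noisy pair count it shares with the values already chosen. Because it reads only the noisy counts and never the sensitive records, it is pure post-processing. All names and counts are hypothetical.

```python
import random


def synthesize_record(noisy_counts, domain, rng):
    """Build one synthetic record by sampling one attribute at a time."""
    record = {}
    for attr in sorted(domain):
        weights = []
        for value in domain[attr]:
            if not record:
                # First attribute: weight by its single-attribute count.
                w = noisy_counts.get(frozenset([(attr, value)]), 0)
            else:
                # Later attributes: the candidate must be supported jointly
                # with every value already chosen.
                w = min(
                    noisy_counts.get(frozenset([prev, (attr, value)]), 0)
                    for prev in record.items()
                )
            weights.append(max(w, 0))
        if sum(weights) == 0:
            # No supported pair: fall back to single-attribute counts.
            weights = [
                max(noisy_counts.get(frozenset([(attr, v)]), 0), 0)
                for v in domain[attr]
            ]
        if sum(weights) == 0:
            # Still nothing: sample uniformly over the public domain.
            weights = [1] * len(domain[attr])
        record[attr] = rng.choices(domain[attr], weights=weights, k=1)[0]
    return record


def synthesize(noisy_counts, domain, n_records, rng):
    """Generate n_records synthetic records from noisy aggregates only."""
    return [synthesize_record(noisy_counts, domain, rng) for _ in range(n_records)]


# Hypothetical noisy aggregates; omitted combinations count as zero.
DOMAIN = {"age": ["18-20", "21-23"], "gender": ["F", "M"]}
NOISY_COUNTS = {
    frozenset([("age", "18-20")]): 2,
    frozenset([("age", "21-23")]): 1,
    frozenset([("gender", "F")]): 2,
    frozenset([("gender", "M")]): 1,
    frozenset([("age", "18-20"), ("gender", "F")]): 2,
    frozenset([("age", "21-23"), ("gender", "M")]): 1,
}
synthetic = synthesize(NOISY_COUNTS, DOMAIN, n_records=3, rng=random.Random(0))
```

In this toy input, combinations with a zero noisy count are never generated, so the synthetic records reproduce only the attribute patterns that the released aggregates support.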

Stakeholders can improve their understanding of risk factors for vulnerability and implement effective counter-trafficking interventions when they have access to accurate yet anonymous patterns of attributes describing victim-perpetrator relations.

What’s next?

To make the solution available to other organizations and government entities, Microsoft and IOM have released it publicly. Any interested party can use it to collect and share sensitive data safely.

Together with the UN Office on Drugs and Crime (UNODC), IOM has been developing guidelines and recommendations to assist countries in generating high-quality administrative data. IOM has also been working with the International Labour Organization (ILO) to compile a bibliography of studies focusing on the effects of trafficking on public policy. To encourage governments and frontline anti-trafficking organizations to share data securely, IOM is developing an online course that will include a session with instructions on synthetic data.


Dhanshree Shenwai is a computer science engineer with experience at FinTech companies covering the financial, cards and payments, and banking domains, and a keen interest in applications of AI. She is enthusiastic about exploring new technologies and advancements that make everyone's life easier.
