Unlock your hospital's data for research — safely.
Saudi Arabia is investing billions in health AI and genomics under Vision 2030. But PDPL requires patient data to be anonymised before it leaves your network, whether it goes to researchers, AI vendors, or partners abroad. We make that step fast, accurate, and auditable, running entirely on your own infrastructure.
Saudi Arabia's Personal Data Protection Law (PDPL), in force since September 2023, classifies all health data as sensitive. Sharing it with researchers, AI vendors, or partners abroad without proper de-identification carries fines of up to SAR 3 million and, for wilful breach, custodial sentences.
Vision 2030 commits over USD 65 billion to digital health, genomics, and clinical AI by 2030. Hospitals are expected to feed national datasets, partner with AI labs, and publish research — yet most have no operational way to scrub Arabic clinical text at scale.
The result is a quiet bottleneck: data sits in EHRs, projects stall in legal review, and external collaborators wait. We built VeilHealth to remove that one narrow bottleneck.
Maximum PDPL penalty: SAR 3 million
Per breach involving sensitive personal data. Custodial sentences apply for wilful disclosure.
Vision 2030 health-tech commitment: over USD 65 billion
Across genomics, AI, and digital infrastructure programmes, all of which depend on de-identified data.
Current Arabic recall, MSA: ~72%
Honest baseline. Improving monthly. We always recommend a human-in-the-loop review pass.
No data leaves your network
Runs entirely on your infrastructure: VM, container, or air-gapped appliance. No cloud egress, ever.
VeilHealth ingests free-text notes, scanned PDFs, structured fields, and DICOM metadata. It detects and redacts personally identifying entities — patient names, national IDs, MRNs, phone numbers, addresses, dates of admission and discharge, family relations — across both Modern Standard Arabic and English, including dialectal spellings common in Gulf clinical notes.
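To make the detection step concrete, here is a minimal sketch of rule-based redaction for the structured identifiers listed above. It is not VeilHealth's detector: the patterns, the `redact` function, and the placeholder labels are all assumptions made for illustration (a 10-digit national ID starting with 1 or 2, a Saudi mobile number, and an "MRN"-prefixed record number). Names, addresses, and family relations cannot be caught this way and need a trained entity-recognition model.

```python
import re

# Map Arabic-Indic digits (U+0660..U+0669) to ASCII so one pattern covers both scripts.
ARABIC_INDIC = {0x0660 + i: str(i) for i in range(10)}

# Illustrative patterns only; these are assumptions, not VeilHealth's actual rules.
# Order matters: MRN runs before NATIONAL_ID so a 10-digit MRN is not mislabelled.
PATTERNS = {
    "PHONE": re.compile(r"(?<!\d)(?:\+9665|009665|05)\d{8}(?!\d)"),
    "MRN": re.compile(r"\bMRN[:\s#-]*\d{5,10}\b", re.IGNORECASE),
    "NATIONAL_ID": re.compile(r"(?<!\d)[12]\d{9}(?!\d)"),
}


def redact(text):
    """Replace each matched identifier with a [LABEL] placeholder.

    Returns the redacted text and a count of replacements per label.
    """
    text = text.translate(ARABIC_INDIC)
    counts = {}
    for label, pattern in PATTERNS.items():
        text, n = pattern.subn(f"[{label}]", text)
        if n:
            counts[label] = n
    return text, counts


note = "Patient MRN: 448812, ID 1087654321, mobile 0551234567."
clean, counts = redact(note)
print(clean)  # -> Patient [MRN], ID [NATIONAL_ID], mobile [PHONE].
```

Normalising digits first means an ID typed with Arabic-Indic numerals is caught by the same pattern, at the cost of rewriting those digits as ASCII in the output.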
Every redaction event writes to a tamper-evident log. An optional encrypted re-identification map, accessible only to roles you nominate, allows authorised staff to relink records when a regulator, IRB, or treating physician requires it.
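One common way to make a log tamper-evident is a hash chain, where each entry's hash covers the previous entry's hash. The sketch below shows that general technique under stated assumptions; the source does not say how VeilHealth builds its log, and the function names and event fields (`analyst`, `timestamp`, `rule`) are hypothetical.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder "previous hash" for the first entry


def _digest(prev_hash, event):
    # Canonical JSON so the same event always hashes to the same value.
    payload = json.dumps(event, sort_keys=True, ensure_ascii=False)
    return hashlib.sha256((prev_hash + payload).encode("utf-8")).hexdigest()


def append_entry(ledger, event):
    """Append an event, chaining its hash to the previous entry."""
    prev = ledger[-1]["hash"] if ledger else GENESIS
    entry = {"event": event, "prev": prev, "hash": _digest(prev, event)}
    ledger.append(entry)
    return entry["hash"]


def verify(ledger):
    """Recompute every hash; any edited or reordered entry breaks the chain."""
    prev = GENESIS
    for entry in ledger:
        if entry["prev"] != prev or entry["hash"] != _digest(prev, entry["event"]):
            return False
        prev = entry["hash"]
    return True


ledger = []
append_entry(ledger, {"analyst": "a.salem", "timestamp": "2026-01-05T09:00:00Z", "rule": "NATIONAL_ID"})
append_entry(ledger, {"analyst": "a.salem", "timestamp": "2026-01-05T09:00:01Z", "rule": "PHONE"})
print(verify(ledger))  # -> True
```

Changing any field of an earlier entry makes `verify` return `False`, which is the property an auditor relies on.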
19 PII categories, two languages.
Names, IDs, MRNs, contact details, dates, addresses, providers, family members. Tuned for MSA and Gulf clinical idioms.
PDFs, free text, DICOM, FHIR.
OCR for scanned Arabic forms. Each file comes back in the format it went in (JSON, DOCX, or PDF), with structure preserved.
Tamper-evident audit trail.
Every detection, redaction, and re-identification request is logged and hashed. Exportable to your SIEM.
Open source, on your kit.
Apache-2.0 core. Deploys via Docker, Kubernetes, or RHEL appliance. No telemetry, no phone-home.
Three steps. No data leaves the wall.
Upload or stream.
Drop a folder, a file, or pipe a live HL7 / FHIR feed into the connector. Files never leave your network — VeilHealth runs as a container alongside your existing systems.
Identify, redact, log.
19 entity classes recognised across Arabic + English. Each redaction writes to a hashed audit ledger with the analyst's identity, timestamp, and policy rule that triggered it.
De-identified out.
Receive the clean dataset, plus an encrypted re-identification map kept on your premises. Authorised staff with a key — and a logged reason — can relink when a regulator or IRB requires it.
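The relinking rule in the last step (an authorised role plus a logged reason) can be sketched as follows. This is an illustration under assumptions, not VeilHealth's implementation: the class, method names, and token format are hypothetical, tokens are derived with HMAC-SHA256, and encrypting the map at rest is deliberately left out because it needs a vetted cryptography library rather than the standard library.

```python
import hashlib
import hmac


class ReidentificationMap:
    """Hypothetical token-to-identifier map with gated, logged relinking."""

    def __init__(self, secret_key, authorised_roles):
        self._key = secret_key
        self._roles = set(authorised_roles)
        self._token_to_value = {}
        self.access_log = []  # every relink attempt, granted or not

    def pseudonymise(self, value):
        # Keyed hash: the same identifier always yields the same token,
        # but the token cannot be recomputed without the key.
        digest = hmac.new(self._key, value.encode("utf-8"), hashlib.sha256).hexdigest()
        token = f"PT-{digest[:16]}"
        self._token_to_value[token] = value
        return token

    def relink(self, token, role, reason):
        allowed = role in self._roles and bool(reason.strip())
        self.access_log.append(
            {"token": token, "role": role, "reason": reason, "granted": allowed}
        )
        if not allowed:
            raise PermissionError("relink requires an authorised role and a stated reason")
        return self._token_to_value[token]


reid = ReidentificationMap(b"demo-key-not-for-production", {"dpo"})
token = reid.pseudonymise("1087654321")
print(reid.relink(token, "dpo", "IRB request"))  # -> 1087654321
```

Refused attempts are logged as well as granted ones, so the access log shows who tried to relink a record and why.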
Built for the work that doesn't ship today.
Send IRB-approved cohorts to academic partners.
Run the redaction across thousands of patient records and attach the de-identification certificate to the data transfer agreement, so legal review has the evidence it needs instead of stalling for six months.
Hand a vendor real data — without handing them PHI.
Whether the vendor is a domestic startup or a global cloud, you keep the original records and a re-identification key. They get only what they need to train.
Demonstrate de-identification provenance on demand.
When SDAIA or a Ministry of Health auditor asks how a record was anonymised, hand them the hashed log entry, the policy version, and the human reviewer's signature.
Move data abroad inside the PDPL fence.
Properly anonymised data falls outside PDPL's cross-border transfer restrictions. VeilHealth's output, paired with the audit trail, is what your data-protection officer signs off on.
What VeilHealth is not.
Compliance buyers prefer to read the limits before the features. We agree.
What it does
- Detects and redacts 19 PII classes in Arabic + English clinical text.
- Runs entirely on your infrastructure — Docker, K8s, or appliance.
- Produces a hashed, exportable audit trail per record.
- Supports OCR for scanned forms in Arabic and English.
- Provides an encrypted re-identification map for authorised use.
What it doesn't do
- Replace your DPO or legal-compliance review of data transfers.
- Guarantee 100% recall: current MSA recall is ~72%, English ~96%. We recommend a human-in-the-loop review pass.
- De-identify imaging pixel data (DICOM headers only — pixel-level redaction is on the 2026 roadmap).
- Send any data to external services. There is no SaaS mode.
- Provide ML training itself — we anonymise; your team trains.
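To see what the recall figures in the list above mean in practice: recall is the share of true identifiers that the detector finds, so one minus recall is the share it misses. The short sketch below works through the arithmetic; the function names and the 1,000-identifier corpus are illustrative assumptions, not measured results.

```python
def recall(true_positives, false_negatives):
    """Share of real identifiers that were detected."""
    return true_positives / (true_positives + false_negatives)


def expected_misses(total_identifiers, recall_rate):
    """Identifiers likely to slip through at a given recall."""
    return round(total_identifiers * (1 - recall_rate))


# A hypothetical corpus containing 1,000 identifiers:
print(expected_misses(1000, 0.72))  # -> 280 at the stated MSA recall
print(expected_misses(1000, 0.96))  # -> 40 at the stated English recall
```

At these rates a reviewer would still need to catch hundreds of identifiers per thousand in Arabic text, which is why the human-in-the-loop pass is recommended rather than optional.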
Open. Auditable. Built for this region.
Adapted and re-trained for clinical Arabic and Saudi healthcare entity formats.
Inspectable, forkable, no proprietary blob. Read every line that touches your patients' data.
Designed against the Saudi PDPL and UAE Federal Data Protection Law. Mappable to pseudonymisation as defined in GDPR Article 4(5).
No outbound calls, no telemetry. Runs in fully isolated networks where many hospital workloads have to live.
Ten hospitals. Eight weeks. One de-identified dataset.
We're onboarding ten partners for the Q3 2026 cohort. Each pilot includes deployment support, a tailored MSA fine-tune on your own corpus, a PDPL impact assessment, and direct access to the engineering team. No fee during the pilot window.