Introducing SMQL: Search 343K+ Scans with 120+ Filters

Why a Query Language?
ScanMalware processes thousands of URL scans daily, collecting data across dozens of dimensions: TLS certificates, JavaScript behavior, WHOIS registration, technology stacks, IDS alerts, malware patterns, and more. Basic text search gets you started, but real threat hunting requires combining signals.
SMQL (ScanMalware Query Language) lets you write precise queries that cross-reference any combination of our 120+ filters and 22 existence checks. Think of it as SQL for security scan data, but with a syntax designed for quick, interactive use.
Quick Start
SMQL queries are built from filters, boolean operators, and modifiers:
field:value # Exact match
field:>value # Comparison (>, <, >=, <=)
field:value1..value2 # Range
field:*wildcard* # Wildcard with * and ?
has:feature # Existence check
-field:value # Negation
filter1 AND filter2 # Boolean AND (implicit between terms)
filter1 OR filter2 # Boolean OR
(filter1 OR filter2) AND ... # Grouping with parentheses
sort:newest # Sort order
Try it now at scanmalware.com/search-advanced.
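If you generate or post-process queries in a script, it can help to recognize which kind of term you are looking at. The following is a minimal client-side sketch based only on the syntax listed above. The function name `classify_term` is hypothetical, and this is not the parser ScanMalware runs on the server.

```python
def classify_term(term: str) -> dict:
    """Classify a single whitespace-free SMQL term using the Quick Start syntax.

    Illustrative only: the real server-side parser may treat edge cases
    (quoting, values containing '?' or '..') differently.
    """
    negated = term.startswith("-")
    if negated:
        term = term[1:]

    # Split on the first ':' only, so values such as URLs keep their colons.
    field, sep, value = term.partition(":")
    if not sep or not field or not value:
        raise ValueError(f"not a field:value term: {term!r}")

    if field == "has":
        kind = "existence"
    elif field == "sort":
        kind = "sort"
    elif value.startswith((">", "<")):  # also covers >= and <=
        kind = "comparison"
    elif ".." in value:
        kind = "range"
    elif "*" in value or "?" in value:
        kind = "wildcard"
    else:
        kind = "exact"

    return {"field": field, "value": value, "kind": kind, "negated": negated}


for term in ["title:*paypal*", "-domain:paypal.com", "http_status:400..499", "has:ids"]:
    print(term, "->", classify_term(term)["kind"])
```

One known limitation of this sketch: a value that legitimately contains `?` (for example a URL with a query string) would be labeled as a wildcard.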
Real-World Examples
Finding Phishing Campaigns
Search for pages mimicking PayPal that our AI confirmed as scams:
title:*paypal* -domain:paypal.com verdict:CONFIRMED_SCAM
Newly Registered Domains with Suspicious Verdicts
Combine WHOIS domain age with security verdicts to find domains younger than 30 days that were confirmed as scams:
domain_age:<30 verdict:CONFIRMED_SCAM
Expired Certificates in the Last Month
Find sites with expired TLS certificates from recent scans:
cert_expired:true submitted:>last30d
Self-Signed Certificates with IDS Alerts
Cross-reference TLS anomalies with network intrusion signatures:
cert_self_signed:true has:ids
Obfuscated JavaScript with High Risk Scores
Hunt for heavily obfuscated scripts that triggered behavioral risk scoring:
js_obfuscated:true js_risk_score:>70
WordPress Sites Behind Cloudflare with Bot Protection
Combine technology fingerprints with bot-protection detection:
technology:WordPress bot_detection:cloudflare
Filter Categories
SMQL covers 120+ filters organized into 14 categories:
| Category | Filters | Examples |
|---|---|---|
| Core | 12 | url, domain, title, status, submitted, http_status |
| Network | 7 | ip, asn, asn_org, country, city, ip_count |
| WHOIS/RDAP | 8 | registrar, domain_age, nameserver, rir |
| Security | 11 | verdict, ai_risk_score, clamav, rpki, ioc |
| TLS/Certificates | 21 | cert_issuer, cert_expired, key_algorithm, key_size, cert_risk, ct_logged |
| JARM | 2 | jarm, jarm_known |
| Technologies | 4 | technology, tech_category, cpe |
| JavaScript | 15 | js_risk, js_eval, malware_family, obfuscation_score |
| Hashes | 9 | phash, favicon_hash, js_hash, ssdeep, tlsh |
| Tracking | 4 | tracker, tracking_id, tracker_category |
| Content | 16 | ocr, bot_detection, ids_category, clearfake_type, bundler |
| CT | 5 | ct_domain, ct_san, ct_hash |
| DNS | 2 | ns, zone_domain |
| rDNS | 2 | rdns, ptr |
Plus 22 existence checks with the has: prefix: has:malware, has:certificate, has:ids, has:phishing, has:pastejacking, has:clearfake, has:clone, has:webpack, has:pcap, has:ct, and more.
Comparison and Range Operators
Numeric and date filters support comparison operators:
load_time:>5 # Pages slower than 5 seconds
key_algorithm:RSA key_size:<2048 # Weak RSA keys
ai_risk_score:>7 # High AI risk scores (scale 0-10)
domain_age:<30 # Domains younger than 30 days
cert_days:<7 # Certificates expiring within a week
submitted:>last24h # Scans from the last 24 hours
Range queries use the .. syntax:
http_status:400..499 # All 4xx client errors
js_risk_score:60..100 # High to critical JS risk
submitted:2025-06..2025-12 # June through December 2025
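When these terms are built from variables in a script, a pair of small helpers keeps the operator syntax in one place. This is a sketch, and the names `compare` and `between` are hypothetical rather than part of any ScanMalware library. Whether range endpoints are inclusive is not stated here, so the helper simply emits the `..` syntax as documented.

```python
COMPARISON_OPS = (">", "<", ">=", "<=")


def compare(field: str, op: str, value) -> str:
    """Return a comparison term such as 'domain_age:<30'."""
    if op not in COMPARISON_OPS:
        raise ValueError(f"unsupported operator: {op!r}")
    return f"{field}:{op}{value}"


def between(field: str, low, high) -> str:
    """Return a range term such as 'http_status:400..499'."""
    return f"{field}:{low}..{high}"


# Adjacent terms are joined with a space (implicit AND).
weak_rsa = " ".join(["key_algorithm:RSA", compare("key_size", "<", 2048)])
client_errors = between("http_status", 400, 499)

print(weak_rsa)       # key_algorithm:RSA key_size:<2048
print(client_errors)  # http_status:400..499
```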
Boolean Logic
SMQL supports AND, OR, NOT, and parentheses for grouping. Adjacent terms are implicitly AND-ed:
# These are equivalent:
technology:WordPress country:RU
technology:WordPress AND country:RU
# OR requires explicit operator:
technology:WordPress OR technology:Joomla
# Grouping:
(technology:WordPress OR technology:Joomla) AND country:CN
# Negation (two forms):
-domain:google.com
NOT domain:google.com
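For scripted queries, the grouping rules above can be wrapped in two small combinators. This is a sketch with hypothetical helper names (`any_of`, `all_of`). It always parenthesizes OR groups, because the relative precedence of AND and OR is not spelled out here and explicit grouping avoids relying on it.

```python
def any_of(*terms: str) -> str:
    """Join terms with OR, parenthesized so the group can sit inside an AND."""
    if not terms:
        raise ValueError("any_of() needs at least one term")
    if len(terms) == 1:
        return terms[0]
    return "(" + " OR ".join(terms) + ")"


def all_of(*terms: str) -> str:
    """Join terms with an explicit AND."""
    if not terms:
        raise ValueError("all_of() needs at least one term")
    return " AND ".join(terms)


query = all_of(
    any_of("technology:WordPress", "technology:Joomla"),
    "country:CN",
)
print(query)  # (technology:WordPress OR technology:Joomla) AND country:CN
```

If you nest an `all_of(...)` result inside `any_of(...)`, wrap it in parentheses yourself, since this sketch only adds them around OR groups.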
Performance
SMQL queries are optimized for speed and safety:
- Count capping limits result counting to 10,000 rows for fast pagination on broad queries
- Caching with stampede protection reduces load on repeated queries
- 15-second timeout with clear error messages if a query is too complex
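To make "stampede protection" concrete: when many identical requests miss the cache at the same moment, only one of them should run the expensive query while the rest wait for its result. The sketch below illustrates that general technique with a per-key lock. It is not ScanMalware's implementation, and the class name `SingleFlightCache` and its parameters are made up for this example.

```python
import threading
import time
from typing import Any, Callable, Dict, Tuple


class SingleFlightCache:
    """TTL cache where concurrent misses on the same key share one computation."""

    def __init__(self, ttl_seconds: float = 60.0) -> None:
        self._ttl = ttl_seconds
        self._lock = threading.Lock()                      # guards both dicts
        self._values: Dict[str, Tuple[float, Any]] = {}    # key -> (expiry, value)
        self._key_locks: Dict[str, threading.Lock] = {}    # key -> per-key lock

    def _fresh(self, key: str):
        """Return (True, value) on a live hit, else (False, None). Caller holds _lock."""
        hit = self._values.get(key)
        if hit is not None and hit[0] > time.monotonic():
            return True, hit[1]
        return False, None

    def get(self, key: str, compute: Callable[[], Any]) -> Any:
        with self._lock:
            ok, value = self._fresh(key)
            if ok:
                return value
            key_lock = self._key_locks.setdefault(key, threading.Lock())

        # Only one thread per key gets past this point at a time.
        with key_lock:
            with self._lock:
                # Re-check: another thread may have filled the entry while we waited.
                ok, value = self._fresh(key)
                if ok:
                    return value
            value = compute()
            with self._lock:
                self._values[key] = (time.monotonic() + self._ttl, value)
            return value
```

A production cache would also evict old entries and unused per-key locks; both are left out to keep the sketch short.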
Every query result includes an API link that opens the raw JSON response, which makes it easy to integrate SMQL into scripts and automation, including the ScanMalware CLI.
API Access
SMQL is available as a REST API endpoint:
GET /api/v1/search/smql?q=<query>&page=1&limit=20&sort=newest
The response includes paginated results, query timing, and filter metadata:
{
  "query": "technology:WordPress AND country:RU",
  "results": [...],
  "pagination": {
    "total_items": 463,
    "total_pages": 24,
    "exact_count": true
  },
  "query_time_ms": 42.3
}
The filter reference endpoint at /api/v1/search/smql/filters returns all available filters with descriptions, examples, and supported operators, which is useful for building autocomplete interfaces. Full API documentation is available at /api/docs.
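As a starting point for scripting against the search endpoint, here is a minimal sketch using only the Python standard library. Several things in it are assumptions rather than documented behavior: the base URL `https://scanmalware.com`, the helper names, and the reading of `exact_count` as "false when the total was capped". Check /api/docs before relying on any of them.

```python
import json
import math
from urllib.parse import quote, urlencode
from urllib.request import urlopen

BASE_URL = "https://scanmalware.com"  # assumption: API is served from the main site


def smql_url(query: str, page: int = 1, limit: int = 20,
             sort: str = "newest", base_url: str = BASE_URL) -> str:
    """Build the search URL, percent-encoding the query (':' '>' '*' spaces, etc.)."""
    params = urlencode(
        {"q": query, "page": page, "limit": limit, "sort": sort},
        quote_via=quote,  # encode spaces as %20 rather than '+'
    )
    return f"{base_url}/api/v1/search/smql?{params}"


def search(query: str, **kwargs) -> dict:
    """Run one search and return the decoded JSON body (performs a network call)."""
    # Client timeout is set slightly above the 15-second server-side limit.
    with urlopen(smql_url(query, **kwargs), timeout=20) as resp:
        return json.load(resp)


def describe_total(pagination: dict) -> str:
    """Render the result count, marking it as a lower bound if it is not exact."""
    total = pagination["total_items"]
    # Assumption: exact_count is false when counting stopped at the cap.
    return str(total) if pagination.get("exact_count") else f"{total}+"


def page_count(total_items: int, limit: int) -> int:
    """Pages needed for total_items at the given page size."""
    return math.ceil(total_items / limit)


print(smql_url("technology:WordPress AND country:RU"))
print(page_count(463, 20))  # 24, matching total_pages in the sample response
```

Calling `search("cert_self_signed:true has:ids", limit=50)` would then return a dictionary shaped like the sample response, with the result rows under `"results"`.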
Try It
Visit scanmalware.com/search-advanced to start querying. The search page includes example queries, an inline filter reference, and a sort dropdown. Click the API link in any result to see the raw JSON.
We're actively expanding SMQL -- upcoming features include filters for Certificate Transparency data, DNS zone lookups, and reverse DNS searches. To see SMQL in action for real-world threat hunting, check out our analysis of the ShinyHunters phishing kit.