Security Scan Report: github.com

Submitted: May 14, 2026, 5:52:05 PM
Completed: May 14, 2026, 5:53:22 PM
Scan type: public
Status: completed

Summary

This website contacted 2 IPs in 2 countries across 2 domains to perform 111 HTTP transactions. The main domain is github.com; its registration age could not be computed (reported as NaN).

Submitted URL: https://github.com/pdf2htmlEX/pdf2htmlEX/blob/master/share/LICENSE=20

The Cisco Umbrella rank of the primary domain is #1,803 of the top 1 million websites (Top 10K Site).

AI Security Verdict

Moderate Risk

Confidence: 78%

Risk Score: 4

Legitimate GitHub site showing login forms, but critical IDS alerts raise moderate concern for possible malware activity.

Risk Factors
Critical IDS alerts suggesting data exfiltration
Potential C2 beacon detected by IDS
Credential collection forms (login) on the page
Safety Factors
Well‑established domain (≈18 years old)
Brand matches domain (self‑branding, not impersonation)
No malicious Indicators of Compromise found
No JavaScript YARA malware patterns detected
No cross‑origin credential exfiltration observed
Domain age information unavailable

Details

Page Title

File not found · GitHub

Scan Type

public

Language

🇺🇸 English (80% confidence)

Category

technology software (70%)

Domain Information

Domain 'github.com' uses the commercial generic top-level domain (.com) and has no subdomain. The label 'github' has 6 characters: 2 vowels and 4 consonants. Tokenizing the label suggests 3 words (g, it, hub) with a median word length of 2 characters. The frequency lists yielded no strong language cues.
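The lexical figures above can be reproduced with a short sketch. This is an illustration only, not the scanner's actual implementation: the function name `label_stats` is hypothetical, the naive split on "." assumes a single-part TLD (it would mishandle suffixes such as .co.uk), and the token list is copied from the report rather than derived, since the report does not say which word list the tokenizer uses.

```python
from statistics import median

VOWELS = set("aeiou")


def label_stats(domain: str) -> dict:
    """Split a simple domain into label and TLD and count letter classes.

    Hypothetical helper: assumes a single-part TLD such as .com.
    """
    parts = domain.lower().split(".")
    label, tld = parts[-2], parts[-1]
    letters = [c for c in label if c.isalpha()]
    vowels = sum(c in VOWELS for c in letters)
    return {
        "label": label,
        "tld": tld,
        "has_subdomain": len(parts) > 2,
        "length": len(label),
        "vowels": vowels,
        "consonants": len(letters) - vowels,
    }


stats = label_stats("github.com")

# Tokens as listed in the report; the segmentation itself is not reproduced here.
tokens = ["g", "it", "hub"]
median_word_length = median(len(t) for t in tokens)

print(stats)
print(median_word_length)
```

Counting only alphabetic characters keeps digits and hyphens out of the vowel and consonant totals, which matters for labels less clean than 'github'.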

Screenshot

Security scan screenshot of https://github.com/pdf2htmlEX/pdf2htmlEX/blob/master/share/LICENSE=20

Page Load Overview

Total Load Time: 7.58s
HTTP Requests: 147
Domains: 4
Total Size: 1.6 MB

Language Analysis

Primary Language

🇺🇸 English
Code: en
Confidence: 80%
Script: Latin
Direction: ltr

Detection Details

Language Code: en
Detection Confidence: 80%
Script Type: Latin
HTML Lang Attribute: en
Text Length: 3,434 chars
Detector Agreement: 100%

Website Classification

Primary Category

technology software (70% confidence)
Type: spa
Method: ml+structural

All Detected Categories

technology software: 70%
documentation technical: 48%
forum community discussion: 47%

Detected Features

Login Form
Search
OG: object

Domain & IP Information

Requests     IP Address       Location                           AS (Autonomous System)
74           140.82.121.4     Frankfurt am Main, Hesse, Germany  AS36459 GitHub, Inc.
73           185.199.108.215  United States                      AS54113 Fastly, Inc.
147 (total)  2 (IPs)          -                                  -
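As a quick cross-check of the table above, the per-IP request counts can be summed and compared with the total row. The sketch below only restates numbers from the report; the variable names are illustrative.

```python
# Per-IP request counts as listed in the table.
requests_by_ip = {
    "140.82.121.4": 74,     # AS36459 GitHub, Inc.
    "185.199.108.215": 73,  # AS54113 Fastly, Inc.
}

total_requests = sum(requests_by_ip.values())
distinct_ips = len(requests_by_ip)

# Share of all requests served by each IP, as a percentage.
shares = {
    ip: round(100 * count / total_requests, 1)
    for ip, count in requests_by_ip.items()
}

print(total_requests, distinct_ips)
print(shares)
```

The sum matches the 147 requests and 2 IPs shown in the total row.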

Content Similarity Hashes (for malware variant detection)

TLSH (Trend Micro Locality Sensitive Hash)

Security-focused

Specialized for malware detection and similarity analysis

T14A14C5B16178A53D016F2ACAF670D714D36BE31BEB8856E4B57F82B857C3C84EA43058

ssdeep (Context Triggered Piecewise Hashing)

Context-aware

Detects similar content even with modifications

6144:3zprQ355hkIbZwz/JZI7xqdn5pjUsvVm7MVqWX8ijE9WuqdeJ5PTQYbeHxHA7opD:3hQ355hkIbZwz/JZI7xqdn5pjUsvVm70

sdhash (Similarity Digest Hashing)

High-precision

High-precision similarity detection for forensic analysis

sdhash:3:207211:GIYA2IIiBEaLkDYaSATIEggApgKIYYTB0pgqPruEGJYARGSvKIAAUROzJIAiKSQCBUBF8MwoyDAMQbEARcHkB2FuJBGghCYQ

These hashes enable detection of similar websites and malware variants by comparing content similarity even when exact matches aren't found.

Image Hashes

Perceptual Hashes

Average Hash: 3f3f1f1f7f7f7f7f
Perceptual Hash: 801fafc1f1b8f8e0
Difference Hash: e060b0b0c0c08080
Wavelet Hash: 1e1e0e3e0e0e4e4e
Color Hash: #77d22d

Other Hashes

Crop Resistant: e060b0b0c0c08080
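Hex image hashes like the 64-bit values above are commonly compared by Hamming distance, the number of differing bits between two hashes of the same type. The sketch below assumes that comparison method; the report does not state how the scanner compares hashes. The function name `hamming_distance` and the `later_scan_dhash` value are hypothetical, the latter invented purely to show a non-zero distance.

```python
def hamming_distance(hex_a: str, hex_b: str) -> int:
    """Count differing bits between two equal-length hex-encoded hashes."""
    if len(hex_a) != len(hex_b):
        raise ValueError("hashes must have the same length")
    return bin(int(hex_a, 16) ^ int(hex_b, 16)).count("1")


# Values taken from the report.
difference_hash = "e060b0b0c0c08080"
crop_resistant = "e060b0b0c0c08080"

# Hypothetical hash from another scan, differing in the last hex digit.
later_scan_dhash = "e060b0b0c0c08081"

same = hamming_distance(difference_hash, crop_resistant)
near = hamming_distance(difference_hash, later_scan_dhash)

print(same, near)
```

A distance of 0 means the two hashes are identical, as with the Difference Hash and Crop Resistant values in this report. Small distances suggest visually similar screenshots. Only hashes produced by the same algorithm are meaningfully comparable.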

Scan History

Scan history not available
