In the previous step-by-step tutorial "2026 Latest Guide: Teach You to Build Rainyun Automatic Check-in for $0, Daily Auto-claim Points for Renewal", we have already implemented using GitHub Actions to automatically claim Rainyun points for free every day. Many readers have privately asked: "How does the script automatically pass the CAPTCHA? Does it use the 2captcha platform? Does it cost money?" The answer is: Not a single penny, pure local AI recognition. Today's article will take you deep into the automatic cracking technology behind the Rainyun check-in CAPTCHA in the Rainyun-Qiandao project, exploring why it can run stably in GitHub Actions' free environment, and what makes it superior to paid solutions like 2captcha.
Rainyun uses the Tencent TCaptcha (Waterproof Wall) CAPTCHA. If you've done check-ins, you've definitely seen it: a background image containing multiple small patterns, plus 3 target tiles that you need to click in order.
Simply put, it's a "Text/Pattern Click-to-Select" type CAPTCHA:
| CAPTCHA Type | Typical Example | Difficulty to Crack |
|---|---|---|
| Simple Digit CAPTCHA | Old Forums | ⭐ |
| Sliding Puzzle | Early Geetest, Tencent Slider | ⭐⭐ |
| Text/Pattern Click-to-Select | Tencent TCaptcha, Geetest v4 | ⭐⭐⭐⭐ |
| reCAPTCHA v3 | Google Behavioral Analysis | ⭐⭐⭐⭐⭐ |
The challenge of Tencent TCaptcha lies in: Unlike a simple slider that only requires calculating an offset, it requires you to accurately locate the coordinates of 3 target patterns from the background image and click them in sequence. This means the cracking program must not only "understand" what the target looks like but also "find" them within a large, colorful image.
Precisely because its front-end rendering mechanism is based on JavaScript dynamic loading, pure HTTP requests cannot obtain the real image data of the CAPTCHA. Therefore, the project chose to use Selenium to simulate a real browser.
Many automation tutorials recommend third-party solving platforms such as 2captcha or Anti-Captcha. Their principle is simple: the CAPTCHA image is sent to the platform's server, solved there by humans or AI, and the result is returned. For the "Rainyun daily check-in" scenario, however, such solutions have several fatal problems:
| Comparison Dimension | 2captcha and other paid platforms | This Solution (ddddocr local recognition) |
|---|---|---|
| Cost | ~$2-3 / 1000 times | $0 (Completely Free) |
| Speed | 10~60 seconds (waiting for human) | 1~3 seconds (local inference) |
| Privacy | Account info passes through third-party | Entirely local, zero data transmission |
| Dependency | Requires internet, platform may shut down anytime | Runs offline, built into GitHub Actions |
| GPU Requirement | None (server-side processing) | None (CPU is enough, ddddocr is extremely lightweight) |
In other words, the 2captcha approach offers no cost-effectiveness in the context of "claiming free points." You're checking in to save money, but the solving fee is more expensive than the points?
The core idea of this project is "zero cost, pure local, fully automatic," using AI models to handle everything directly in the runtime environment.
The tech stack of this Rainyun automatic check-in CAPTCHA version solution can be summarized in one sentence: Selenium is responsible for "operating the browser," ddddocr is responsible for "understanding text," and OpenCV is responsible for "finding the location."
Let's break it down layer by layer.
Why use Selenium instead of sending HTTP requests directly? Because Tencent TCaptcha performs extensive front-end detection:
- Checks navigator.webdriver to determine if it's an automation tool

The project uses Chrome headless mode to run Selenium, optimized for server environments:
# Headless mode + 1080p window (avoid element overlap causing misclicks)
ops.add_argument("--headless")
ops.add_argument("--window-size=1920,1080")
ops.add_argument("--no-sandbox")
ops.add_argument("--disable-dev-shm-usage") # Docker/Actions memory optimization
Why set the window to 1920×1080? Mainly to simulate a real user's browser environment. A normal user wouldn't browse the web with an 800×600 window. Setting a standard resolution helps reduce the risk of being flagged as an automation tool by anti-fraud systems.
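The flags above could be assembled roughly as follows. This is a minimal sketch, not the project's code: `build_chrome_args` and `make_driver` are hypothetical helper names, and Selenium is imported lazily so the flag list can be inspected without a browser installed.

```python
def build_chrome_args(width=1920, height=1080, headless=True):
    """Return the Chrome command-line flags discussed above as a list."""
    args = [
        f"--window-size={width},{height}",
        "--no-sandbox",
        "--disable-dev-shm-usage",  # avoid /dev/shm limits in Docker/Actions
    ]
    if headless:
        args.insert(0, "--headless")
    return args


def make_driver():
    """Create a Chrome driver with those flags (needs Selenium and Chrome)."""
    from selenium import webdriver  # lazy import: only needed when launching

    ops = webdriver.ChromeOptions()
    for arg in build_chrome_args():
        ops.add_argument(arg)
    return webdriver.Chrome(options=ops)
```

Keeping the flag list in a plain function makes it easy to log or adjust the window size per environment before a browser is ever started.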
Simply running headless Chrome is far from enough; Selenium can be detected in various ways. The project implements two layers of protection:
Layer 1: stealth.min.js Injection
This file isn't handwritten; it's automatically extracted and generated from puppeteer-extra-plugin-stealth (7.2k+ ⭐).
The stealth plugin is originally exclusive to Puppeteer. It injects a series of anti-detection scripts when the browser opens a page via page.evaluateOnNewDocument(). The extract-stealth-evasions tool does something clever: it replaces evaluateOnNewDocument with a "listener" that only records the script content without executing it.
// Core trick: Monkey Patch
page.__proto__.evaluateOnNewDocument = function(func, args) {
  // Don't execute, just concatenate the script source code
  scripts += '(' + func.toString() + ')(' + JSON.stringify(args) + ');\n'
}
The complete generation process is as follows:
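1. Monkey-patch the evaluateOnNewDocument and evaluate methods
2. Collect the recorded script sources in the scripts variable
3. Minify scripts with Terser, add a file header
4. Write the result to stealth.min.js

The generation command is a single line: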
npx extract-stealth-evasions # Generates stealth.min.js in the current directory
This file contains 16 evasion (anti-detection) modules, covering all common automation detection points:
| Module | Purpose |
|---|---|
| navigator.webdriver | Removes webdriver property (most basic detection point) |
| chrome.runtime | Masquerades chrome.runtime (most complex module) |
| chrome.app / csi / loadTimes | Completes APIs only present in normal Chrome |
| navigator.plugins | Masquerades browser plugin list |
| navigator.languages | Masquerades language settings |
| navigator.permissions | Masquerades permission query results |
| navigator.hardwareConcurrency | Masquerades CPU core count |
| iframe.contentWindow | Fixes iframe cross-origin detection |
| media.codecs | Masquerades media codec support |
| window.outerdimensions | Masquerades window dimensions |
| user-agent-override | Masquerades UA and platform info |
| sourceurl | Hides the source URL of injected scripts |
The injection method is also simple, automatically executed before each page loads via Chrome DevTools Protocol (CDP):
# Read the generated stealth.min.js
with open("stealth.min.js", "r") as f:
    stealth_js = f.read()
# Inject into the browser, every page opened afterwards will automatically execute anti-detection scripts
driver.execute_cdp_cmd("Page.addScriptToEvaluateOnNewDocument", {"source": stealth_js})
The essence of this solution lies in: stealth.min.js was originally designed for Puppeteer (Node.js), but because it's ultimately just a piece of pure JS code, it can be injected into any browser via CDP, including Chrome driven by Selenium.
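The injection step can be wrapped in a small helper. The sketch below is illustrative only: `inject_on_new_document`, `inject_stealth`, and `RecordingDriver` are hypothetical names, and `RecordingDriver` is a stand-in that records CDP calls so the helper can be tried without a browser. The only real API assumed is Selenium's `execute_cdp_cmd` on a Chrome driver.

```python
from pathlib import Path


def inject_on_new_document(driver, script_source):
    """Register JS that runs before any page script on every new document."""
    return driver.execute_cdp_cmd(
        "Page.addScriptToEvaluateOnNewDocument", {"source": script_source}
    )


def inject_stealth(driver, path="stealth.min.js"):
    """Read the generated stealth script from disk and register it."""
    source = Path(path).read_text(encoding="utf-8")
    return inject_on_new_document(driver, source)


class RecordingDriver:
    """Stand-in for a Selenium Chrome driver; it only records CDP calls."""

    def __init__(self):
        self.calls = []

    def execute_cdp_cmd(self, cmd, params):
        self.calls.append((cmd, params))
        return {"identifier": str(len(self.calls))}
```

The script only affects documents created after registration, so it should be registered before the first `driver.get(...)` call.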
Layer 2: Account-Specific Browser Fingerprint
The project generates a deterministic User-Agent and fingerprint parameters from the account ID. The same account gets a consistent fingerprint on every run (simulating "the same person"), while different accounts get different fingerprints (avoiding correlation detection):
# Generate a dedicated User-Agent for each account
user_agent = get_random_user_agent(account_id)
ops.add_argument(f"--user-agent={user_agent}")
# Inject deterministic browser fingerprint
fingerprint_js = generate_fingerprint_script(current_user)
driver.execute_cdp_cmd("Page.addScriptToEvaluateOnNewDocument", {"source": fingerprint_js})
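One way such determinism could be achieved is to seed a random generator from a stable hash of the account ID. This is a sketch under assumptions: the article does not show how `get_random_user_agent` or `generate_fingerprint_script` work internally, and the User-Agent pool, `account_rng`, and `pick_fingerprint` below are hypothetical.

```python
import hashlib
import random

# Hypothetical pool; the project's real list is not shown in this article.
USER_AGENTS = [
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36",
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36",
    "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36",
]


def account_rng(account_id):
    """Build a random generator seeded by a stable hash of the account ID."""
    digest = hashlib.sha256(account_id.encode("utf-8")).digest()
    return random.Random(int.from_bytes(digest[:8], "big"))


def pick_fingerprint(account_id):
    """Derive the same User-Agent and hardware values for the same account."""
    rng = account_rng(account_id)
    return {
        "user_agent": rng.choice(USER_AGENTS),
        "hardware_concurrency": rng.choice([4, 8, 12, 16]),
    }
```

`hashlib` is used instead of the built-in `hash()` because Python salts string hashes per process, which would give the same account a different fingerprint on every run.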
ddddocr ("Dai Dai Di Di OCR") is an open-source general-purpose CAPTCHA recognition library by a Chinese developer, based on the ONNX Runtime inference engine. The project utilizes its two core capabilities:
| Function | Purpose | Corresponding Mode |
|---|---|---|
| OCR Text Recognition | Recognizes Chinese characters/digits/letters in patterns | DdddOcr(ocr=True) |
| Object Detection | Locates all possible candidate regions in a large image | DdddOcr(det=True) |
In multi-account concurrent scenarios, if each thread loads a copy of the model, memory would explode instantly. The project uses Double-Checked Locking to implement a global singleton:
import threading

_ocr_model = None
_det_model = None
_model_lock = threading.Lock()
_inference_lock = threading.Lock()  # Inference lock, preventing concurrent inference conflicts

def get_shared_ocr_models():
    global _ocr_model, _det_model
    # First check without the lock: fast path once the models are loaded
    if _ocr_model is None or _det_model is None:
        with _model_lock:
            # Second check inside the lock: another thread may have loaded them
            if _ocr_model is None or _det_model is None:
                import ddddocr
                _ocr_model = ddddocr.DdddOcr(ocr=True, show_ad=False)
                _det_model = ddddocr.DdddOcr(det=True, show_ad=False)
    return _ocr_model, _det_model
Even with 10 accounts running simultaneously, only one copy of the model is loaded, keeping memory usage within an acceptable range.
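The inference lock is declared in the snippet but its use is not shown. A minimal sketch of how a shared model might be guarded during inference follows; `LockedModel` is a hypothetical wrapper and `FakeOcr` is a stand-in for a ddddocr model, used here only to observe how many threads are inside the model at once.

```python
import threading


class LockedModel:
    """Wrap a model so that only one thread calls into it at a time."""

    def __init__(self, model, lock=None):
        self._model = model
        self._lock = lock if lock is not None else threading.Lock()

    def call(self, method_name, *args):
        """Invoke model.<method_name>(*args) while holding the lock."""
        with self._lock:
            return getattr(self._model, method_name)(*args)


class FakeOcr:
    """Stand-in for a ddddocr model; records the peak number of callers."""

    def __init__(self):
        self.active = 0
        self.peak = 0

    def classification(self, image_bytes):
        self.active += 1
        self.peak = max(self.peak, self.active)
        result = f"{len(image_bytes)} bytes"
        self.active -= 1
        return result
```

With the real models, the same wrapper would presumably be used as `LockedModel(ocr).call("classification", image_bytes)`, passing one shared lock to both wrappers if OCR and detection must not overlap.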
Relying solely on ddddocr to "recognize characters" is not enough. Targets in the CAPTCHA could be Chinese characters, irregular graphics, small icons, or even numbers. The project designs a two-stage, multi-algorithm fusion positioning strategy.
- Use ddddocr's det mode to scan the large image and draw bounding boxes around all candidate regions that may contain targets.

Simplified Scoring Formula:
if OCR_Recognition(target) == OCR_Recognition(candidate) and Shape_Score ≥ threshold:
    Score = 75 + Shape_Score × 25  # Semantic hit, strong signal
elif Shape_Score ≥ 0.55:
    Score = Shape_Score × 20  # High shape match
else:
    Score = SIFT_Inlier_Count  # Geometric feature fallback
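The simplified formula above translates almost directly into Python. In this sketch, `score_candidate` and its parameter names are hypothetical; the formula does not state the value of the first threshold, so it is left as a required argument rather than guessed.

```python
def score_candidate(ocr_match, shape_score, sift_inliers, ocr_shape_threshold):
    """Score one candidate region following the simplified formula.

    ocr_match: True when OCR returns the same text for target and candidate.
    shape_score: shape similarity in [0, 1].
    sift_inliers: number of SIFT inlier matches (geometric fallback).
    ocr_shape_threshold: minimum shape score required alongside an OCR hit;
        its value is not given in the article, so the caller supplies it.
    """
    if ocr_match and shape_score >= ocr_shape_threshold:
        return 75 + shape_score * 25  # semantic hit, strong signal
    if shape_score >= 0.55:
        return shape_score * 20  # high shape match
    return float(sift_inliers)  # geometric feature fallback
```

For example, an OCR hit with a shape score of 0.8 scores 75 + 0.8 × 25 = 95, while the same shape score without an OCR hit scores only 0.8 × 20 = 16.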
If the total confidence score from Stage 1 is below a threshold (e.g., candidate boxes are poorly cropped, or background is too noisy causing object detection to miss), the system automatically degrades to full-image scan mode:
- Use cv2.connectedComponentsWithStats to extract all dark connected components, then filter by area to keep reasonably sized candidates.
- Use cv2.matchTemplate for normalized cross-correlation matching.

Finally, from the full-image search candidates, select the 3 points with the highest total score that are also sufficiently far apart from each other, and use them as click coordinates.
The core idea of this two-stage design is: Be precise when possible (fast), resort to brute-force search when precision fails (stable).
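The final selection step (highest scores, sufficiently far apart) can be sketched as a greedy pick. This is an assumption about how the selection might work, not the project's code: `pick_click_points` is a hypothetical name and the 30-pixel default spacing is an illustrative value.

```python
import math


def pick_click_points(candidates, count=3, min_distance=30.0):
    """Greedily pick the highest-scoring points that are far enough apart.

    candidates: iterable of (x, y, score) tuples.
    min_distance: hypothetical pixel spacing; not a value from the project.
    """
    chosen = []
    for x, y, score in sorted(candidates, key=lambda c: c[2], reverse=True):
        far_enough = all(
            math.hypot(x - cx, y - cy) >= min_distance for cx, cy, _ in chosen
        )
        if far_enough:
            chosen.append((x, y, score))
        if len(chosen) == count:
            break
    return chosen
```

The distance check keeps two overlapping detections of the same pattern from consuming two of the three clicks.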
After finding the pixel coordinates of the target in the original image, they need to be converted to the actual click position in the browser. This is because the rendering size of the CAPTCHA image in the DOM may differ from the original image size:
# Original image coordinates → DOM rendering coordinates
width_raw, height_raw = captcha.shape[1], captcha.shape[0] # Original image size
width, height = float(get_width_from_style(style)), ... # DOM rendering size
final_x = int(-width/2 + x / width_raw * width) # Offset relative to element center
final_y = int(-height/2 + y / height_raw * height)
# Use ActionChains to move to the precise position and click
ActionChains(driver).move_to_element_with_offset(slideBg, final_x, final_y).click().perform()
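The same conversion written as a pure function makes the arithmetic easy to check. `to_element_offset` is a hypothetical helper name; the formula itself is the one used in the snippet above.

```python
def to_element_offset(x, y, raw_size, rendered_size):
    """Map a pixel in the original image to an offset from the element center.

    raw_size: (width, height) of the downloaded CAPTCHA image.
    rendered_size: (width, height) of the element as drawn in the page.
    """
    raw_w, raw_h = raw_size
    dom_w, dom_h = rendered_size
    offset_x = int(-dom_w / 2 + x / raw_w * dom_w)
    offset_y = int(-dom_h / 2 + y / raw_h * dom_h)
    return offset_x, offset_y
```

For a 680×400 image rendered at 340×200, the image center (340, 200) maps to offset (0, 0) and the top-left corner maps to (-170, -100). To my knowledge this center-relative convention matches Selenium 4.3 and later; earlier versions measured `move_to_element_with_offset` from the element's top-left corner, in which case the `-width/2` and `-height/2` terms would not apply.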
The recognition rate for the Rainyun check-in CAPTCHA cannot be 100%; the original author's informal test data shows a single-pass rate of about 48.3%. This is not a problem, because the project has a built-in multi-level automatic retry:
After each CAPTCHA submission, the script checks the class attribute of the result element:
- show-success → passed, continue the check-in process.

This process is recursive; as long as the browser is alive, it will keep retrying.
If the entire check-in process fails (network timeout, page structure changes, etc.), there is an outer account-level retry mechanism:
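CHECKIN_MAX_RETRIES).

Let's calculate, assuming attempts are independent: with a single-pass rate of 48.3%, the pass rate within 3 CAPTCHA-level attempts is 1 - (1-0.483)³ ≈ 86.2%. Combined with 3 account-level retries, the failure rate drops to about (1-0.862)³ ≈ 0.3%, so the final success rate exceeds 99%.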
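The arithmetic can be reproduced in a few lines. The sketch assumes every attempt succeeds independently with the same probability, which is a simplification; `success_after` is a hypothetical helper name.

```python
def success_after(attempts, single_rate):
    """Probability that at least one of `attempts` independent tries succeeds."""
    return 1 - (1 - single_rate) ** attempts


captcha_level = success_after(3, 0.483)          # three CAPTCHA attempts
account_level = success_after(3, captcha_level)  # three account-level rounds
print(f"{captcha_level:.1%}, {account_level:.1%}")  # prints "86.2%, 99.7%"
```

In practice failures are probably correlated (a hard image style may fail repeatedly), so the real figure is likely somewhat lower than this idealized estimate.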
If you have already set up the basic check-in tutorial, the following advanced configurations can further improve stability:
PROXY_API_URL)

If you have multiple accounts, they all run on the same GitHub Actions Runner and therefore share the same IP. A large number of check-in requests from the same IP may trigger Rainyun's anti-fraud measures.
After configuring PROXY_API_URL, each account will request a new IP from the proxy interface before checking in, achieving "one account, one IP." Even if proxy acquisition fails, it automatically degrades to using the local IP, preventing check-in failure due to proxy issues.
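The "use a proxy if available, otherwise fall back to the local IP" behavior might look like the sketch below. It rests on assumptions: the proxy API's response format is not described in the article, so a plain-text `host:port` reply is assumed, and `fetch_proxy` and `resolve_proxy` are hypothetical names.

```python
import os
import urllib.request


def fetch_proxy(api_url, timeout=10):
    """Ask the proxy API for an address; assumes a plain-text 'host:port' reply."""
    with urllib.request.urlopen(api_url, timeout=timeout) as resp:
        return resp.read().decode("utf-8").strip()


def resolve_proxy(api_url=None, fetcher=fetch_proxy):
    """Return a proxy address, or None to fall back to the local IP."""
    api_url = api_url or os.environ.get("PROXY_API_URL")
    if not api_url:
        return None  # proxy not configured
    try:
        proxy = fetcher(api_url)
    except Exception:  # network error, timeout, malformed reply
        return None
    return proxy or None
```

A caller could then add Chrome's `--proxy-server` flag only when `resolve_proxy()` returns a value, so a failing proxy API never blocks the check-in itself.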
The project utilizes GitHub Actions Cache to cache cookies after login. Benefits:
Cookies are saved in JSON format in the temp/cookies/ directory, filenames based on account hash, not leaking original account information.
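A hash-named cookie cache could be sketched like this. The hash algorithm and filename length the project uses are not stated, so SHA-256 truncated to 16 hex characters is an assumption, and `cookie_path`, `save_cookies`, and `load_cookies` are hypothetical names.

```python
import hashlib
import json
from pathlib import Path


def cookie_path(account, base_dir="temp/cookies"):
    """Build a cookie file path whose name does not reveal the account."""
    digest = hashlib.sha256(account.encode("utf-8")).hexdigest()[:16]
    return Path(base_dir) / f"{digest}.json"


def save_cookies(account, cookies, base_dir="temp/cookies"):
    """Write a list of cookie dicts, e.g. from driver.get_cookies()."""
    path = cookie_path(account, base_dir)
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(cookies), encoding="utf-8")
    return path


def load_cookies(account, base_dir="temp/cookies"):
    """Return the cached cookies, or None when no cache file exists."""
    path = cookie_path(account, base_dir)
    if not path.exists():
        return None
    return json.loads(path.read_text(encoding="utf-8"))
```

Because the filename is derived only from the account, the same account always finds its own cache file across runs once the directory is restored by Actions Cache.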
Many people running headless browsers ignore window size settings, resulting in a very small default window (usually 800×600). This causes serious CAPTCHA recognition issues:
The project forces --window-size=1920,1080, ensuring a standard 1080p rendering space even in headless mode.
Controlled by the SCREENSHOT_MODE environment variable:
- all: Screenshot every time, high traffic consumption but convenient for troubleshooting.
- failed_only (default recommended): Screenshot only on failure, saving resources otherwise.
- none: No screenshots at all, extremely lightweight.

Additionally, when CAPTCHA recognition fails, the script automatically saves a "debug package" to the logs/captcha_debug/ directory, containing the original CAPTCHA images, target tiles, metadata JSON, etc., which is very convenient for developer troubleshooting.
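The three modes map onto a small decision function. In this sketch, `screenshot_mode` and `should_screenshot` are hypothetical names, and treating an unset or unrecognized value as `failed_only` is an assumption based on that mode being described as the default.

```python
import os

VALID_MODES = ("all", "failed_only", "none")


def screenshot_mode(env=None):
    """Read SCREENSHOT_MODE, falling back to failed_only for unknown values."""
    env = os.environ if env is None else env
    mode = env.get("SCREENSHOT_MODE", "failed_only").strip().lower()
    return mode if mode in VALID_MODES else "failed_only"


def should_screenshot(mode, succeeded):
    """Decide whether to capture a screenshot for one check-in result."""
    if mode == "all":
        return True
    if mode == "failed_only":
        return not succeeded
    return False  # mode == "none"
```

Passing a plain dict as `env` makes the parsing easy to exercise without touching the real environment.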
Yes. GitHub Actions provides unlimited free minutes for public repositories (private repos have a 2000-minute monthly limit). A single check-in takes about 3~5 minutes, so there's no need to worry about limits. ddddocr model inference also runs entirely on the Runner's CPU, requiring no external API calls.
No. ddddocr excels at single-character recognition of Chinese characters, English letters, and digits. For complex irregular graphics (like "click all hats"), the project relies on OpenCV's SIFT feature matching and Canny edge matching to compensate. This is why the project designs multi-algorithm fusion; there's no silver bullet, but a combination punch can be very stable.
Tencent TCaptcha's underlying framework is relatively stable (CAPTCHA images are in the #slideBg element). As long as the page structure doesn't change significantly, the script can continue working. If compatibility issues arise, we will update and adapt promptly. You are welcome to report issues in the GitHub repository.
In the context of Rainyun check-in, absolutely not necessary. This solution's ddddocr local recognition already replaces 2captcha for Rainyun check-in, with zero cost, low latency, and no privacy risk. Only if you need to crack pure behavioral-analysis CAPTCHAs like reCAPTCHA v3 should you consider paid platforms.
| Technical Component | Role | Key Features |
|---|---|---|
| Selenium | Browser Automation | Headless mode, stealth.js anti-detection, account-specific fingerprint |
| ddddocr | Text/Object Recognition | CPU inference, dual-model singleton, thread-safe inference lock |
| OpenCV | Image Matching & Positioning | SIFT+RANSAC, Canny edge template, connected component analysis, binary shape IoU |
| Retry Mechanism | Fault Tolerance | CAPTCHA-level recursive retry + Account-level timed retry |
| Cookie Cache | Reduce Anti-Fraud Risk | Actions Cache persistence, hash naming, automatic expiration detection |
The essence of this Rainyun automatic check-in CAPTCHA version solution lies in: Not pursuing 100% single-attempt recognition rate, but pushing the final success rate above 99% through multi-algorithm fusion + multi-level retry.
If you haven't set up the basic automatic check-in yet, please refer first to: "2026 Latest Guide: Teach You to Build Rainyun Automatic Check-in for $0, Daily Auto-claim Points for Renewal (Step-by-Step Tutorial)". Deployment can be done in 3 steps.
Repository address: https://github.com/LeapYa/Rainyun-Qiandao
If you find it useful, don't forget to give the repository a Star ⭐!
🚨 Legal Risk Warning: The automated check-in and CAPTCHA recognition technologies discussed in this article may violate Rainyun Platform's "User Agreement" and related service terms, carrying risks such as account suspension and point reset. Such technical solutions belong to a gray area and are provided solely for learning Selenium automation, OCR recognition, and computer vision principles. Do not use them in production environments or for large-scale commercial purposes. Use implies you have fully evaluated the risks and voluntarily bear the consequences.