Why You Can Trust Our Data: 46,000 Dive Logs Collected Over 20 Years
2026-03-16
Starting from 118 observations in 2006, our diving visibility database has grown to more than 46,000 over 20 years. Here's how we collected and verified it all.
| Total observations | Dive sites | Data span | Source types |
|---|---|---|---|
| 46,483 | 42 | 20 yrs | 8+ |
Data Accumulation History
Chart: cumulative observation count by year. The major jump in 2026 comes from full site integration.
Diversity of Data Sources
Japanese dive shops publish daily logs on various blog platforms. No single API covers them all, so we had to build a custom scraper for each platform.
| Source | Observations | Key sites |
|---|---|---|
| ExBlog | 7,800 | Yonaguni, Osezaki |
| WordPress REST API | 5,200 | Ito, others |
| CSV (manual) | 4,460 | IOP, Akinohama |
| Custom site scrape | 12,000 | Futo, Kushimoto, Kumomi, Echizen |
| Hatena Blog | 2,095 | Omijima |
| FC2 Blog | 1,533 | Kerama |
| Blogspot | 1,392 | Tajiri |
| Wix Blog | 2,696 | Hirasawa |
| Others | 9,307 | Ishigaki, Kerama, etc. |
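For the WordPress rows in the table above, the standard WordPress REST API exposes posts at `/wp-json/wp/v2/posts`. The following is a minimal sketch of how such a source could be read, not the scraper actually used here: the function names `wp_posts_url` and `extract_posts` and the `example.com` address are hypothetical, and the network call itself is left out.

```python
import re
from html import unescape
from typing import Dict, List
from urllib.parse import urlencode


def wp_posts_url(site_base: str, page: int = 1, per_page: int = 100) -> str:
    """Build the URL of one page of posts on a WordPress site.

    The REST API caps per_page at 100, so older logs need page=2, 3, ...
    """
    query = urlencode({"per_page": per_page, "page": page})
    return f"{site_base.rstrip('/')}/wp-json/wp/v2/posts?{query}"


def extract_posts(payload: List[Dict]) -> List[Dict]:
    """Reduce decoded JSON post objects to date, link, title and plain text."""
    rows = []
    for post in payload:
        # Strip HTML tags crudely, then decode entities such as &amp;.
        text = unescape(re.sub(r"<[^>]+>", " ", post["content"]["rendered"]))
        rows.append({
            "date": post["date"][:10],  # keep YYYY-MM-DD only
            "link": post["link"],
            "title": unescape(post["title"]["rendered"]),
            "text": " ".join(text.split()),  # collapse whitespace
        })
    return rows
```

Other platforms in the table have no comparable API, which is presumably why each needed its own HTML scraper.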
Top 10 Sites by Observation Count
| Rank | Site | Observations | Since |
|---|---|---|---|
| 1 | Yonaguni | 4,826 | 2010 |
| 2 | Futo | 3,493 | 2013 |
| 3 | Kushimoto | 3,168 | 2015 |
| 4 | IOP | 3,151 | 2006 |
| 5 | Hirasawa | 2,696 | 2015 |
| 6 | Echizen | 2,652 | 2012 |
| 7 | Mikomoto | 2,263 | 2011 |
| 8 | Omijima | 2,095 | 2016 |
| 9 | Kumomi | 1,980 | 2018 |
| 10 | Ito | 1,980 | 2016 |
Yonaguni Leads by a Wide Margin
Yonaguni Diving Service (YDS) has posted to ExBlog almost daily since 2010, contributing 4,826 records. This consistent logging culture is the backbone of the database's value.
Challenges in Building the Database
Challenge 1: Inconsistent Formats
Shops record visibility in many different ways: '透明度15m', 'vis 15', '15〜20m', '透視度10m↑' (透明度 and 透視度 both mean roughly "visibility"). Extracting minimum and maximum values required site-specific regex patterns.
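To make the parsing problem concrete, here is a minimal sketch of a single pattern that handles the four example strings above. It is an illustration under assumptions, not one of the site-specific patterns: `parse_visibility` is a hypothetical name, the trailing '↑' is assumed to mean "this value or better", and the input is assumed to be a short snippet already isolated from the rest of the log.

```python
import re
from typing import Optional, Tuple

# Optional keyword, a number, an optional range separator plus second
# number, an optional "m", and an optional "↑".
VIS_PATTERN = re.compile(
    r"(?:透明度|透視度|vis)?\s*"
    r"(\d+(?:\.\d+)?)"
    r"(?:\s*[〜～~\-]\s*(\d+(?:\.\d+)?))?"
    r"\s*m?\s*(↑)?",
    re.IGNORECASE,
)


def parse_visibility(text: str) -> Optional[Tuple[float, Optional[float]]]:
    """Return (min_m, max_m), or None if the snippet holds no number.

    max_m is None when the value is open-ended (assumed meaning of '↑').
    """
    match = VIS_PATTERN.search(text)
    if match is None:
        return None
    low = float(match.group(1))
    if match.group(2):  # explicit range such as 15〜20m
        return low, float(match.group(2))
    if match.group(3):  # open-ended value such as 10m↑
        return low, None
    return low, low  # single value: min and max coincide


for raw in ["透明度15m", "vis 15", "15〜20m", "透視度10m↑"]:
    print(raw, parse_visibility(raw))
```

A pattern this loose would also match dates or depths in a full post, which is one reason per-site patterns are more reliable in practice.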
Challenge 2: Outliers and Errors
Examples include a 215 m typo at Miyakejima, a Saipan trip log mixed into Futo's records, and a physically impossible 100 m reading at Amami. We manually identified and removed 11 such outliers. Data quality control is an ongoing effort.
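The review described above was manual. A simple pre-filter could narrow down what a person has to check, and the sketch below shows one possible shape for it. Everything in it is an assumption rather than the actual pipeline: `flag_suspects`, the 60 m ceiling, and the 4x site-median factor are hypothetical, and every sample value except the 215 m typo is invented.

```python
from statistics import median
from typing import Dict, List, Tuple


def flag_suspects(
    records: List[Tuple[str, float]],
    ceiling_m: float = 60.0,      # hypothetical plausibility ceiling
    median_factor: float = 4.0,   # hypothetical site-relative threshold
) -> List[Tuple[str, float, str]]:
    """Return (site, visibility_m, reason) rows worth a manual look.

    Nothing is deleted here; the function only proposes candidates.
    """
    by_site: Dict[str, List[float]] = {}
    for site, vis in records:
        by_site.setdefault(site, []).append(vis)
    site_median = {site: median(values) for site, values in by_site.items()}

    suspects = []
    for site, vis in records:
        if vis > ceiling_m:
            suspects.append((site, vis, "above ceiling"))
        elif vis > median_factor * site_median[site]:
            suspects.append((site, vis, "far above site median"))
    return suspects


# Invented sample values, apart from the 215 m typo mentioned in the text.
sample = [
    ("Miyakejima", 15.0), ("Miyakejima", 18.0), ("Miyakejima", 215.0),
    ("Futo", 8.0), ("Futo", 10.0), ("Futo", 12.0), ("Futo", 45.0),
]
print(flag_suspects(sample))
```

The site-relative check is there because a log from another location can be plausible in absolute terms yet still stand out against a site's usual range.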
Challenge 3: Blog Closures and Migrations
Dive shop blogs sometimes shut down or migrate. Hirasawa moved from Livedoor to Wix, requiring a completely new scraper. Continuous maintenance is essential.
The Value of This Database
Foundation for AI
46,000 observations train our LightGBM models for visibility and water temperature prediction. More data per site generally means better accuracy.
Seasonal & Long-term Trends
Twenty years of data reveal seasonal patterns and the effects of climate change and the Kuroshio meander. Trends invisible in short-term data become apparent.
Practical Info for Divers
Data-driven answers to 'when and where to dive for best visibility.' Not anecdotal — based on thousands of real measurements.
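As an illustration of the "when to dive" question, the sketch below ranks calendar months by median visibility for a single site. It is a simplified example rather than the analysis behind the forecasts: `best_months` is a hypothetical name and the sample observations are invented.

```python
from collections import defaultdict
from datetime import date
from statistics import median
from typing import Dict, Iterable, List, Tuple


def best_months(
    observations: Iterable[Tuple[date, float]], top_n: int = 3
) -> List[Tuple[int, float]]:
    """Rank months (1-12) by median visibility in metres, best first."""
    by_month: Dict[int, List[float]] = defaultdict(list)
    for day, vis in observations:
        by_month[day.month].append(vis)
    ranked = sorted(
        ((month, median(values)) for month, values in by_month.items()),
        key=lambda pair: pair[1],
        reverse=True,
    )
    return ranked[:top_n]


# Invented observations for one site, pooled across years.
obs = [
    (date(2024, 1, 10), 20.0), (date(2025, 1, 12), 24.0),
    (date(2024, 7, 3), 8.0), (date(2024, 7, 20), 10.0),
    (date(2024, 10, 5), 15.0),
]
print(best_months(obs, top_n=2))
```

The median is used instead of the mean so that a single unusually clear or murky day does not dominate a month with few observations.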
About the Data
46,483 observations collected from the daily logs of dive shops at 42 sites across Japan (2006 to March 2026). 11 outliers were removed. Scrapers run three times a day via GitHub Actions to keep the data up to date.
🌊 Check Visibility Forecasts
View AI-powered 7-day visibility forecasts for 30+ dive sites across Japan.