How Observation Frequency Affects Visibility Data: Daily vs Weekend-Only Logging
2026-03-16
Why is Yonaguni's 24.5m average so trustworthy? Because it's logged almost every day. Meanwhile, sites with weekend-only data may look clearer than they really are.
Key Insight
Weekend-only data may contain 'fair weather diving' selection bias. Shops that only operate and log on good-weather weekends tend to report higher-than-actual average visibility. Sites with daily logging are the most reliable data sources.
Daily-Logging Sites: Most Reliable Data
| Site | Total Obs | Frequency |
|---|---|---|
| Yonaguni | 4,826 | Near-daily |
| Kushimoto | 3,168 | Near-daily |
| IOP | 3,151 | Near-daily |
| Futo | 3,493 | Near-daily |
Yonaguni has the most observations (4,826), logged near-daily and including bad-weather days. Because poor days are recorded alongside good ones, its average is the closest representation of 'true average visibility.' Futo, Kushimoto, and IOP are also logged near-daily and are similarly reliable.
Weekend-Heavy Example: Osezaki Tip
| Day type | Observations |
|---|---|
| Weekend | 335 |
| Weekday | 99 |
Osezaki Tip has a 3.4:1 weekend-to-weekday ratio (335 to 99). If recording were evenly distributed across all 7 days, you'd expect about 0.4:1, since there are 2 weekend days for every 5 weekdays. The observed ratio is roughly 8.5 times that baseline, so weekdays are heavily underrepresented. Poor-weather weekdays are the likeliest to be missing, because shops don't operate (and therefore don't record) on bad weekdays.
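As a rough check on the arithmetic, the sketch below compares the observed ratio with the ratio that uniform logging would produce. With 2 weekend days and 5 weekdays per week, the uniform weekend-to-weekday baseline is 2/5. The function and constant names are illustrative, not part of our pipeline.

```python
# Weekend-to-weekday ratio under perfectly uniform logging:
# 2 weekend days for every 5 weekdays.
UNIFORM_BASELINE = 2 / 5


def weekend_ratio(weekend_obs: int, weekday_obs: int) -> float:
    """Return the weekend-to-weekday observation ratio."""
    return weekend_obs / weekday_obs


# Osezaki Tip counts quoted in this article.
observed = weekend_ratio(335, 99)

# How many times larger the observed ratio is than the uniform baseline.
excess = observed / UNIFORM_BASELINE

print(f"observed {observed:.1f}:1, uniform {UNIFORM_BASELINE:.1f}:1, "
      f"excess x{excess:.1f}")
```

An excess well above 1 says weekends are overrepresented. It does not say why on its own; the fair-weather explanation is an inference from how shops operate.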
Types of Observation Frequency Bias
Fair Weather Bias
Shops that only operate on good weather days (which tend to have better visibility) will show inflated average visibility. Bad weather days are simply missing from the dataset.
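A toy illustration of the effect, using made-up numbers rather than real site data: ten hypothetical days, each with a visibility reading and a flag for whether the shop operated. Averaging only the operating days gives a higher figure than averaging all ten.

```python
from statistics import mean

# Hypothetical (visibility in metres, shop operated?) pairs for ten days.
# These values are invented for illustration only.
days = [
    (20, True), (18, True), (6, False), (22, True), (8, False),
    (15, True), (5, False), (19, True), (7, False), (21, True),
]

# Average over every day, including days nobody logged.
true_mean = mean(v for v, _ in days)

# Average over only the days the shop operated and logged.
logged_mean = mean(v for v, operated in days if operated)

print(f"all days: {true_mean:.1f} m, logged days only: {logged_mean:.1f} m")
```

The gap between the two averages depends entirely on how strongly operating days correlate with good visibility, which this sketch simply assumes.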
Seasonal Bias
Shops that only operate in summer don't capture full-year averages. For sites where winter has the best visibility, summer-only data underestimates the annual average.
Reporting Bias (Notable Days Only)
Some shops only blog about exceptionally good or bad days. This leads to overrepresentation of extreme values and inflated variance compared to reality.
How We Address This
- Showing observation counts: We always display the number of observations per site so readers can judge reliability.
- Outlier removal: Physically impossible values (100m+) and obvious errors are removed.
- ML prediction correction: Our machine learning models train on weather data alongside visibility, partially correcting for observation bias automatically.
- Multiple data sources: We collect from multiple shops and sources per site when possible.
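The outlier-removal step can be sketched as a simple range filter. The 100 m cutoff comes from the list above; treating negative readings as "obvious errors" is an assumption made for this example, and the function name is hypothetical.

```python
# From the text: readings of 100 m or more are treated as physically impossible.
MAX_PLAUSIBLE_VISIBILITY_M = 100.0


def remove_outliers(readings: list[float]) -> list[float]:
    """Keep readings in [0, 100) metres.

    Dropping negative values is an assumed stand-in for "obvious errors";
    the real rules may be broader.
    """
    return [v for v in readings if 0 <= v < MAX_PLAUSIBLE_VISIBILITY_M]


cleaned = remove_outliers([12.0, 150.0, -3.0, 25.0, 100.0])
print(cleaned)
```

A fixed cutoff only catches impossible values. It does nothing about fair-weather or seasonal bias, which is why observation counts are shown alongside the averages.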
About the Data
Observation frequency analysis derived from day-of-week distribution of each site's dates. Weekend bias assessed by the weekend-to-weekday observation ratio. Analysis covers 46,000+ total observations. Data collection methods are based on publicly available information from each dive shop.
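A minimal sketch of how weekend and weekday counts could be derived from a site's observation dates. It assumes "weekend" means Saturday and Sunday only; public holidays are not handled, and the function name is illustrative.

```python
from collections import Counter
from datetime import date


def weekend_weekday_counts(dates: list[date]) -> tuple[int, int]:
    """Return (weekend_count, weekday_count) for a list of observation dates.

    date.weekday() returns 0 for Monday through 6 for Sunday, so values
    of 5 and 6 are Saturday and Sunday.
    """
    counts = Counter(
        "weekend" if d.weekday() >= 5 else "weekday" for d in dates
    )
    return counts["weekend"], counts["weekday"]


# Three sample dates: a Saturday, a Sunday, and a Monday.
sample = [date(2026, 3, 14), date(2026, 3, 15), date(2026, 3, 16)]
weekend, weekday = weekend_weekday_counts(sample)
print(weekend, weekday)
```

Dividing the first count by the second gives the weekend-to-weekday ratio used to assess weekend bias.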