NBN Upload vs Cloud AI: When Local Inference Wins

NBN upload speeds are rarely the bottleneck for cloud AI use in Australia. Latency is. Sending a 500-word prompt to a cloud AI server takes under one second on any NBN plan, because text is small. What slows down interactive AI use is the round-trip time between Australia and AI API servers in the United States, which adds 180 to 250 milliseconds to every request. For multi-turn conversations, that latency compounds. Local AI inference running on a NAS or mini-PC responds in 10 to 50 milliseconds to the first token, regardless of your NBN plan. That difference in responsiveness is what makes local inference feel faster for conversational use, not bandwidth.

In short: NBN upload speed is not why local AI wins in Australia. Latency to US-based AI servers (180 to 250ms round-trip) and privacy are the stronger arguments for running inference locally. Local AI wins for interactive conversation, privacy-sensitive tasks, and offline use. Cloud AI wins for cutting-edge model access, large context windows, and occasional use where hardware investment is not justified.

What Australian NBN Upload Speeds Actually Look Like

NBN plans are marketed by download speed. Upload speeds are lower and, on many plan types, not clearly advertised. The ACCC publishes quarterly broadband performance data showing actual measured speeds across ISPs. The picture for typical Australian households is considerably less impressive than the plan names suggest.

NBN 100 (the most common plan tier) delivers typical upload speeds of 15 to 20 megabits per second on FTTC and FTTP connections. On FTTN (which is the most common access technology for many Australian homes), typical upload speeds are 10 to 17 megabits per second and can vary significantly based on line quality and distance to the node. NBN 50 plans typically deliver 17 to 18 megabits per second upload. NBN 25 plans cap upload at 5 megabits per second. Fixed Wireless NBN often has asymmetric speeds with limited upload. Starlink is variable from 5 to 20 megabits per second upload depending on load and satellite generation.

Plan	Typical upload	Notes
NBN 1000 (FTTP)	50 to 100 Mbps	Best case for cloud AI prompts.
NBN 250 (FTTP)	25 to 50 Mbps	Strong for large document sends.
NBN 100 (FTTC/FTTP)	15 to 20 Mbps	Adequate for all text AI use.
NBN 100 (FTTN)	10 to 17 Mbps	Variable by line quality.
NBN 50	17 to 18 Mbps	Fine for text. Slow for large files.
NBN 25	5 Mbps	Adequate for text. Will feel slow for long documents.
NBN Fixed Wireless	5 to 25 Mbps (variable)	Can be congested during peak hours.
Starlink	5 to 20 Mbps (variable)	Suited to rural use where NBN is unavailable.

Why Latency Matters More Than Upload Speed for AI

A 500-word prompt is approximately 3.5 kilobytes of text data. At an upload speed of 10 megabits per second, that transmits in 0.003 seconds. Even at 5 megabits per second on an NBN 25 plan, the same prompt takes 0.006 seconds to upload. For text-based AI use, upload speed is essentially irrelevant. The bottleneck is not getting your prompt to the server.
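The arithmetic above is easy to check. The following is a minimal sketch rather than a benchmark: the function name `upload_seconds` is illustrative, the 3.5 KB prompt size is the estimate used above, and the calculation ignores protocol overhead (TLS, HTTP headers) and round-trip latency.

```python
def upload_seconds(size_bytes: float, upload_mbps: float) -> float:
    """Ideal transfer time: payload bits divided by link speed in bits per second."""
    bits = size_bytes * 8
    return bits / (upload_mbps * 1_000_000)


PROMPT_BYTES = 3_500  # roughly a 500-word prompt, as estimated above

# Upload speeds in Mbps, taken from the plan figures earlier in the article.
for plan, mbps in [("NBN 25", 5), ("NBN 100 FTTN, low end", 10), ("NBN 100 FTTP", 20)]:
    ms = upload_seconds(PROMPT_BYTES, mbps) * 1000
    print(f"{plan}: {ms:.1f} ms to upload the prompt")
```

Even the slowest plan here finishes the upload in a few milliseconds, which is a small fraction of the trans-Pacific round trip.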

The bottleneck is where the server is and how long the round-trip takes. OpenAI, Anthropic, and Google serve their primary AI APIs from data centres in the United States and Europe. The round-trip time from Australia to those servers is typically 180 to 250 milliseconds. Every exchange in a multi-turn conversation includes that overhead. For a simple question-and-answer interaction, 200 milliseconds is not noticeable. For a rapid back-and-forth coding or debugging session, the cumulative latency adds up: twenty exchanges at 200 milliseconds each is four seconds of waiting that would not exist with a local model.

Streaming responses help but do not eliminate this. Cloud AI APIs stream responses token by token, which means the first token arrives after the round-trip latency has elapsed (200 to 300 milliseconds to first token for US-based servers). Local inference generates the first token in 10 to 50 milliseconds on typical hardware. That immediacy changes the feel of interactive AI use significantly, even when local models generate tokens more slowly overall.

When Local AI Wins in Australia

Local inference running on a NAS or mini-PC serves the local network with round-trip times under 5 milliseconds for wired connections. That responsiveness makes interactive conversation, code assistance, and rapid iteration feel significantly different from cloud AI accessed across the Pacific. For Australian users on any NBN tier, the latency argument for local inference is compelling.

Privacy is the second strong argument. Every prompt sent to a cloud AI service is processed by that provider's infrastructure. For legal documents, medical records, business contracts, customer data, and personal communications, sending that content through a cloud provider is a privacy trade-off that some users and organisations are not willing to make. Local inference processes all data on hardware you control, on your network, and nothing leaves your environment.

Offline use is a third consideration. NBN outages, particularly on FTTN connections in older areas, are a genuine disruption. Local AI inference continues working without internet connectivity. A local model running on a NAS or mini-PC is accessible to all devices on the home or office network regardless of whether the NBN connection is working.

Local AI vs Cloud AI for Australian NBN Users

	Local AI (NAS or mini-PC)	Cloud AI (ChatGPT, Claude, Gemini)
First token latency	10 to 50ms (local network)	200 to 300ms (US servers via NBN)
Upload speed requirement	Not applicable	Trivial. Text is tiny even on NBN 25
Works without internet	Yes, fully offline	No, requires active NBN connection
Privacy	All data stays on your hardware	Prompts processed by provider's infrastructure
Model capability	Limited by local hardware (typically 7B to 13B)	Access to largest frontier models (GPT-4o, Claude 4, etc.)
Context window	Typically 4K to 32K tokens depending on model	Up to 1M+ tokens on leading cloud models
Cost at high usage	Fixed hardware cost, zero per-query cost	Per-query API cost can scale rapidly
Remote access on CGNAT	Requires tunnel (Tailscale or Cloudflare)	Works on any connection without configuration
Multi-user household	One device serves everyone on the network	Separate subscription or API cost per user

When Cloud AI Still Makes Sense Despite NBN

The largest cloud AI models are not available locally. GPT-4o, Claude Sonnet 4, and Gemini Ultra are not models that can be downloaded and run on local hardware. They require data centre scale infrastructure. If your use case requires frontier-level model capability, local inference cannot match that regardless of NBN speed.

Context window size is the other hard constraint. Cloud AI models support context windows of 128,000 tokens or more on leading services. Local models running on 16GB of RAM are typically limited to 4,000 to 32,000 tokens depending on the model and configuration. Feeding a large codebase, a long document collection, or an extensive conversation history requires the context window that only cloud models currently offer at scale.
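A rough way to judge whether a document fits a given context window is to estimate its token count from its word count. The sketch below assumes about 0.75 English words per token, a common rule of thumb that varies by tokenizer and content. The 50,000-word document and the 1,000-token reply reserve are hypothetical values chosen for illustration.

```python
WORDS_PER_TOKEN = 0.75  # assumption: rough average for English text; varies by tokenizer


def estimate_tokens(word_count: int) -> int:
    """Rough token estimate from a word count."""
    return round(word_count / WORDS_PER_TOKEN)


def fits(word_count: int, context_tokens: int, reserve_for_reply: int = 1_000) -> bool:
    """True if the text, plus room reserved for the reply, fits in the context window."""
    return estimate_tokens(word_count) + reserve_for_reply <= context_tokens


doc_words = 50_000  # hypothetical: a 100-page document at about 500 words per page
for label, window in [("local 4K", 4_000), ("local 32K", 32_000), ("cloud 128K", 128_000)]:
    verdict = "fits" if fits(doc_words, window) else "too large"
    print(f"{label}: {verdict}")
```

For an exact count, use the tokenizer that ships with the model you plan to run; this estimate is only good enough to tell whether you are close to the limit.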

For light or occasional use, cloud AI also remains more practical. A household member who uses AI for an hour per week does not benefit from the investment in local AI hardware. At around $30 per month, a cloud subscription avoids both the upfront hardware cost and the power cost of keeping a dedicated mini-PC running around the clock for a casual user. Local AI makes sense when use is frequent enough to justify the hardware.

CGNAT and Remote Access to Your Local AI

Carrier-grade NAT (CGNAT) is a technology used by many Australian ISPs that prevents your home IP address from being directly accessible from the internet. The majority of Australian residential NBN connections are on CGNAT. Telstra, Optus, TPG, iiNet, Exetel, and most resellers use CGNAT by default on NBN. Aussie Broadband is one of the notable exceptions that assigns publicly routable IP addresses and provides static IP options.

CGNAT does not affect local AI inference on your home network. It only matters if you want to access your local AI server from outside your home, such as from a mobile device over 4G or 5G. Without a publicly accessible IP address, you cannot point an external device directly at your home server.

The practical workaround is a VPN tunnel service. Tailscale creates an encrypted mesh network between your devices that works through CGNAT. Your phone running Tailscale can reach your home mini-PC running Ollama as if they were on the same local network, regardless of CGNAT. Cloudflare Tunnel is an alternative that creates an outbound-only connection through Cloudflare's infrastructure, also bypassing CGNAT without requiring an inbound port. Both services have free tiers suitable for personal use.

CGNAT workaround for accessing local AI remotely: Tailscale is the lowest-friction option for most home users. Install Tailscale on your mini-PC or NAS and on your phone or laptop. Your Ollama endpoint becomes accessible as a Tailscale IP address from any location, without any router port-forwarding or static IP requirement. The free tier supports up to 100 devices. See the Tailscale remote access guide for configuration steps.
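Once both devices are on the same Tailscale network, a remote client talks to Ollama exactly as it would on the LAN. The sketch below uses only the Python standard library and Ollama's `/api/generate` endpoint on its default port, 11434. The IP address and model name are placeholders, not real values: substitute your server's Tailscale address and a model you have already pulled. It also assumes Ollama has been configured to listen on more than localhost (by default it binds to 127.0.0.1, and the `OLLAMA_HOST` environment variable on the server changes that).

```python
import json
import urllib.request

# Assumptions: both values are placeholders for illustration.
SERVER_IP = "100.101.102.103"  # hypothetical Tailscale address of the home server
MODEL = "llama3.1:8b"          # hypothetical; use any model pulled on the server


def build_request(host: str, model: str, prompt: str) -> urllib.request.Request:
    """Build a non-streaming POST to Ollama's /api/generate endpoint."""
    body = json.dumps({"model": model, "prompt": prompt, "stream": False}).encode("utf-8")
    return urllib.request.Request(
        f"http://{host}:11434/api/generate",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask(prompt: str, timeout: float = 120.0) -> str:
    """Send the prompt to the server and return the generated text."""
    request = build_request(SERVER_IP, MODEL, prompt)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read())["response"]
```

Calling `ask("Reply with one short sentence.")` from a laptop on mobile data should return the model's reply, provided Tailscale is connected on both ends. No port-forwarding is involved because the traffic travels inside the Tailscale tunnel.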

What NBN Speed Actually Does Affect

While upload speed does not meaningfully impact typical AI prompt submission, there are scenarios where NBN bandwidth does matter for AI-adjacent workflows. Downloading model files is the most common example. A 7B model at Q4_K_M quantisation is approximately 4 to 5 gigabytes. At a typical measured NBN 100 download speed of 70 megabits per second, that downloads in approximately 8 to 10 minutes. At NBN 25 speeds (25 megabits per second), the same file takes around 21 to 27 minutes. For users on slower NBN tiers in rural or regional areas, downloading large model files requires patience but is a one-time task rather than an ongoing constraint.
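The download estimates can be reproduced for any model size and plan speed. This is a minimal sketch assuming decimal gigabytes and a link that sustains the stated speed for the whole transfer. The function name `download_minutes` is illustrative.

```python
def download_minutes(size_gb: float, download_mbps: float) -> float:
    """Ideal download time in minutes, using decimal gigabytes (1 GB = 8,000 megabits)."""
    return size_gb * 8_000 / download_mbps / 60


for speed in (70, 25):  # download speeds in Mbps, as used in the text above
    low = download_minutes(4, speed)
    high = download_minutes(5, speed)
    print(f"{speed} Mbps: {low:.0f} to {high:.0f} minutes for a 4 to 5 GB model")
```

Larger models scale linearly: a file ten times the size takes ten times as long at the same speed.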

Uploading large documents for retrieval-augmented generation (RAG) pipelines is another scenario where upload speed has some relevance. Sending a 100-page PDF (approximately 500 kilobytes) at 5 megabits per second upload takes 0.8 seconds. That is not a meaningful delay even on NBN 25. Multi-gigabyte datasets for local fine-tuning are a different matter, but that is well outside typical home AI use.

The practical conclusion for Australian NBN users is that any NBN plan, including NBN 25 with its 5 megabits per second upload, handles text-based cloud AI workloads without a meaningful bandwidth constraint. The argument for local AI in Australia is not about overcoming NBN limitations. It is about latency, privacy, cost at scale, and offline availability.

Methodology (Real-World, AU-Verified)

Need to Know IT is an independent resource focused on storage and infrastructure decisions. Recommendations are based on official specifications, vendor documentation, and real-world deployment considerations, including availability, warranty, connectivity, and running costs.

Where relevant, guidance is grounded in Australian conditions and pricing, while remaining applicable to global audiences. Our tools and calculators are designed to reflect real-world usage scenarios, not theoretical maximums.

Updates & corrections: Content is reviewed and updated as products change. If you spot an error, contact the editorial team and we'll investigate and correct it.

Related reading: our NAS buyer's guide, our remote access and VPN guide, and our NAS vs cloud storage comparison.

Free tools: NAS Sizing Wizard and NBN Remote Access Checker. No signup required.

Use our free AI Hardware Requirements Calculator to size the hardware you need to run AI locally.

Does slow NBN upload speed affect ChatGPT or Claude response quality?

No. Text prompts are small enough that even NBN 12 or NBN 25 upload speeds transmit them in milliseconds. Upload speed does not affect the quality of cloud AI responses, only how quickly the prompt reaches the server. The far larger factor is round-trip latency from Australia to US-based AI servers (180 to 250ms), which affects how quickly you receive the first token of a response. Upload speed becomes relevant only if you are sending very large files such as multi-gigabyte datasets, which is rare for typical AI use.

Can I access my home Ollama server from my phone over mobile data?

Yes, but it requires a tunnel if your ISP uses CGNAT. Most Australian residential connections are on CGNAT, which means your home IP is not directly accessible from the internet. Tailscale is the recommended solution for most home users. Install Tailscale on your NAS or mini-PC running Ollama and on your mobile device. Your Ollama endpoint will be accessible as a Tailscale IP address from any network, including mobile data, without any router configuration. The Tailscale free tier supports this without any ongoing cost.

Which Australian ISPs use CGNAT on NBN?

The majority of Australian ISPs use CGNAT on residential NBN connections. Telstra, Optus, TPG, iiNet, Exetel, Superloop, and most smaller resellers use CGNAT by default. Aussie Broadband is one of the prominent exceptions and assigns publicly routable IP addresses without requiring an extra subscription. If remote access to a home server is a priority, Aussie Broadband or an ISP that offers static IPs as an included feature is worth considering when choosing a plan.

Is local AI faster than cloud AI on a slow NBN connection?

For interactive conversation, yes. On a local network the round trip to the server is under 5 milliseconds, and the first token typically arrives in 10 to 50 milliseconds, regardless of your NBN connection. Cloud AI adds 180 to 250 milliseconds of network latency on top of the server's processing time because requests travel to the United States and back. The trade-off is that local models generate tokens more slowly once processing begins. A cloud model may generate 100 tokens per second, where a mid-range local mini-PC generates 10 to 25 tokens per second. Local inference feels more responsive to start but may take longer to finish a long response.
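The trade-off between a fast start and slower generation can be put into numbers. The sketch below models a response as time to first token plus output length divided by generation rate. The specific values are assumptions picked from within the ranges quoted in this article (a 30 ms local first token at 15 tokens per second, a 250 ms cloud first token at 100 tokens per second). Real figures depend on hardware, model size, and server load.

```python
def total_seconds(first_token_s: float, tokens_per_s: float, tokens: int) -> float:
    """Time to first token plus generation time for the output at a steady rate."""
    return first_token_s + tokens / tokens_per_s


# Assumed values from within the ranges quoted in the article.
LOCAL = {"first_token_s": 0.03, "tokens_per_s": 15}
CLOUD = {"first_token_s": 0.25, "tokens_per_s": 100}

for n in (1, 50, 500):
    local = total_seconds(tokens=n, **LOCAL)
    cloud = total_seconds(tokens=n, **CLOUD)
    print(f"{n:>3} tokens: local {local:.2f} s, cloud {cloud:.2f} s")
```

Under these assumptions the local model starts sooner, but the cloud model finishes first for anything beyond a few tokens. The local advantage is in perceived responsiveness, not total completion time.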

Does using cloud AI on NBN have any privacy risks?

Yes. Every prompt sent to a cloud AI service is transmitted over the internet and processed by the provider's infrastructure. For general queries this is not a concern, but for sensitive content including legal documents, medical records, business contracts, personal communications, or anything containing private data, cloud AI introduces a genuine privacy trade-off. Cloud providers have data retention and usage policies that vary, and jurisdiction questions apply when data is processed offshore. Local AI inference processes everything on hardware you own and control. Nothing leaves your network. For privacy-sensitive workloads, local inference eliminates this concern entirely.

Setting up local AI inference on a NAS or mini-PC? The local AI hardware comparison covers performance tiers, RAM requirements, and which devices are stocked in Australia.

See the Local AI Hardware Guide