Download PDF from URL in Swift
Apple has a sophisticated caching system in iOS, which is enabled by default. However, documentation around URLCache is quite sparse. Today, we'll look at the behavior of caching when dealing with large files.
Motivation, aka Why Not Build Your Own Cache?
Why even bother having a cache? After all, once a file is downloaded, it's stored locally and can be accessed offline. However, caching logic helps us solve another hard problem: detecting if a file changes on the server.
Getting this correct is not trivial. There are a variety of HTTP headers that can help verify if a file needs to be downloaded again: Expires, ETag, Last-Modified, and If-Modified-Since. They are also functionally different. For example, while ETag makes it possible to reliably compare fingerprints, Last-Modified requires more guesswork/heuristics. The two can also be used together.
Apple spent years building logic that correctly takes all of these details into account, so let's try to reuse what the system offers.
ℹ️ Note: Since we're only using Foundation objects in this article, everything applies to all Apple platforms, from iOS to macOS to tvOS and watchOS. All code here is written in Swift 5, but everything can be used equally in Objective-C, where objects are prefixed (e.g. NSURLCache or NSURLSession).
Caching isn't magic. The system needs a hint from the server as to how long a file should be cached or how to verify it. If your server doesn't emit any hints for the caching system, files won't be cached. Here's a simple way to see which HTTP headers a server sends, using curl:
    curl -i https://www.hq.nasa.gov/alsj/a17/A17_FlightPlan.pdf
    HTTP/1.1 200 OK
    Date: Fri, 06 Nov 2020 10:24:34 GMT
    Server: Apache
    Strict-Transport-Security: max-age=63072000; includeSubdomains; preload
    X-Content-Type-Options: nosniff
    X-Frame-Options: SAMEORIGIN
    Last-Modified: Sun, 19 May 2002 14:49:00 GMT
    Accept-Ranges: bytes
    Content-Length: 20702285
    Content-Type: application/pdf
In this example, Last-Modified is set, so caching will work with an algorithm based on how long ago the file was last modified. This "freshness lifetime" heuristic is described in RFC 2616, and it comes out to 10 percent of the time elapsed since the last modification.
    ([Fri, 6 Nov 2020 10:24:34 GMT] - [Sun, 19 May 2002 14:49:00 GMT]) * 10%
    = 6,745 days, 19 hours, 35 minutes, and 34 seconds * 10%
    = ~674 days

The cache for the above file is valid for ~674 days.
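The same arithmetic can be sketched in Swift. This is only an illustration of the 10 percent heuristic described above, not the code the system actually runs; the function name `heuristicFreshnessLifetime` and its `fraction` parameter are made up for this example, and the two dates are the Last-Modified and Date values from the curl output.

```swift
import Foundation

/// Heuristic freshness lifetime: a fraction (10 percent by default) of the
/// time between the last modification and the moment the response was sent.
/// Hypothetical helper for illustration only.
func heuristicFreshnessLifetime(lastModified: Date,
                                responseDate: Date,
                                fraction: Double = 0.1) -> TimeInterval {
    let age = responseDate.timeIntervalSince(lastModified)
    // A Last-Modified date in the future would give a negative age; clamp to 0.
    return max(0, age * fraction)
}

// Build both dates in UTC so the result doesn't depend on the local time zone.
var utc = Calendar(identifier: .gregorian)
utc.timeZone = TimeZone(secondsFromGMT: 0)!

let lastModified = utc.date(from: DateComponents(
    year: 2002, month: 5, day: 19, hour: 14, minute: 49, second: 0))!
let responseDate = utc.date(from: DateComponents(
    year: 2020, month: 11, day: 6, hour: 10, minute: 24, second: 34))!

let lifetime = heuristicFreshnessLifetime(lastModified: lastModified,
                                          responseDate: responseDate)
print("Fresh for about \(Int(lifetime / 86_400)) days") // Fresh for about 674 days
```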
Cache invalidation works differently depending on the headers used. If only ETag is set, the system must query the server every time.
[Figure: cache decision tree from Apple]
If you'd like to learn more about HTTP caching headers, Google has a great resource at web.dev.
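The curl check shown earlier can also be done from Swift. The sketch below pulls the caching-related headers out of an HTTPURLResponse; the helper names `cachingHints(in:)` and `printCachingHints(for:)` are hypothetical, and Cache-Control is included as an assumption because it commonly carries caching rules even though the server in our example doesn't send it.

```swift
import Foundation
#if canImport(FoundationNetworking)
import FoundationNetworking // Needed on Linux, where URLSession lives in a separate module.
#endif

/// Response headers that give the caching system something to work with.
let cachingHeaderNames = ["Cache-Control", "Expires", "ETag", "Last-Modified"]

/// Returns the caching-related headers present in a response.
/// `value(forHTTPHeaderField:)` is case-insensitive (iOS 13 / macOS 10.15 and later).
func cachingHints(in response: HTTPURLResponse) -> [String: String] {
    var hints: [String: String] = [:]
    for name in cachingHeaderNames {
        if let value = response.value(forHTTPHeaderField: name) {
            hints[name] = value
        }
    }
    return hints
}

/// Sends a HEAD request so only the headers are transferred, then prints the hints.
func printCachingHints(for url: URL) {
    var request = URLRequest(url: url)
    request.httpMethod = "HEAD"
    URLSession.shared.dataTask(with: request) { _, response, _ in
        guard let http = response as? HTTPURLResponse else { return }
        print(cachingHints(in: http))
    }.resume()
}
```

An empty dictionary is the case to watch out for: it means the server gives the cache nothing to base a decision on.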
Downloading Large Files on iOS and macOS
There are many ways files can be downloaded on iOS, but the modern approach is using URLSession with either a dataTask or a downloadTask. The main difference is storage. Data tasks return the data directly, while download tasks return a file URL. The returned file needs to be copied to a local destination in the completion handler to remain accessible.
Let's look at a complete example that downloads the Apollo 11 Flight Plan into the Documents directory of the current application (error handling is omitted for the sake of brevity):
    import Foundation

    let remoteURL = URL(string: "https://www.nasa.gov/specials/apollo50th/pdf/a11final-fltpln.pdf")!
    let documentURL = FileManager.default.urls(for: .documentDirectory, in: .userDomainMask)[0]
    let targetURL = documentURL.appendingPathComponent(remoteURL.lastPathComponent)

    let downloadTask = URLSession.shared.downloadTask(with: remoteURL) { url, response, error in
        guard let tempURL = url else { return }
        // The temporary file is only valid inside this handler, so move it now.
        _ = try? FileManager.default.replaceItemAt(targetURL, withItemAt: tempURL)
    }
    downloadTask.resume()
This will download the file correctly, but it'll likely not use the cache in a way you'd expect.
Caching Download Tasks
Download tasks support caching via either the default URLCache or a custom URLCache. However, there are a few details you need to know about:
- Background Downloading: A download task can run in the background if the background URLSession is used. In this scenario, the download is managed by a system daemon, which has no access to the app-local cache.
- Manual Storage: While download tasks will query the cache, they don't automatically store the result in the cache, unlike data tasks.
In order to store the result of a download task, you need to manually call storeCachedResponse on the cache:
    // `cache` is the URLCache in use, e.g. URLCache.shared or a custom cache.
    let req = URLRequest(url: remoteURL)
    let downloadTask = URLSession.shared.downloadTask(with: req) { url, response, error in
        print("Download Task complete.")
        // Store the response only if it isn't cached yet and the file can be read.
        if let response = response, let url = url,
           cache.cachedResponse(for: req) == nil,
           let data = try? Data(contentsOf: url) {
            cache.storeCachedResponse(CachedURLResponse(response: response, data: data), for: req)
        }
    }
    downloadTask.resume()
The URLCache class has been thread-safe since iOS 8. Things weren't so great in earlier releases.
Verifying URLCache
Each app has a default, sandboxed cache that lives under <APP_ROOT>/Library/Caches/<APP_BUNDLE_ID>. The default size isn't documented, but it can be queried easily. This has been tested on macOS Big Sur via Catalyst and might be different depending on the device:
    (lldb) p (int)[[NSURLCache sharedURLCache] diskCapacity]
    (int) $0 = 20000000 // ~19 MB
    (lldb) p (int)[[NSURLCache sharedURLCache] memoryCapacity]
    (int) $1 = 512000 // ~500 KB
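If you'd rather check this from code than from the debugger, the same properties are available in Swift. A minimal sketch, where `describe(_:)` is a hypothetical helper name; the numbers it prints depend on the platform and OS version, so none are shown here.

```swift
import Foundation
#if canImport(FoundationNetworking)
import FoundationNetworking // Needed on Linux, where URLCache lives in a separate module.
#endif

/// Summarizes current usage and capacity of a cache, in bytes.
func describe(_ cache: URLCache) -> String {
    let memory = "memory: \(cache.currentMemoryUsage)/\(cache.memoryCapacity) bytes"
    let disk = "disk: \(cache.currentDiskUsage)/\(cache.diskCapacity) bytes"
    return "\(memory), \(disk)"
}

// Inspect the default cache every app gets.
print(describe(URLCache.shared))
```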
To effectively use URLCache for file downloads, it needs to be much bigger. We can be quite bold with size requests, as iOS will clean up automatically as needed:
In iOS, the on-disk cache may be purged when the system runs low on disk space, but only when your app is not running (Source: Apple Documentation).
On disk, the cache is a regular SQLite database named Cache.db, including the -shm and -wal files SQLite uses to improve performance. Binary files aren't stored in SQLite but in a separate folder named fsCachedData. The data stored here isn't processed; it's the same data you downloaded. In our case, we can open the PDF by simply renaming 49E5D82A-5749-4094-A934-5D61B767CBF0 and opening it.
Creating a Custom URLCache
The code block below will set up a cache with ~10 MB memory and ~1 GB disk cache space. We use the Caches directory because it isn't backed up in iCloud, and we certainly don't want a cache to be backed up:
    let cachesURL = FileManager.default.urls(for: .cachesDirectory, in: .userDomainMask)[0]
    let diskCacheURL = cachesURL.appendingPathComponent("DownloadCache")
    let cache = URLCache(memoryCapacity: 10_000_000, diskCapacity: 1_000_000_000, directory: diskCacheURL)
There is anecdotal evidence that the cache rejects files that are larger than 10 percent of the total cache size. So you really want to pick a generous size for the cache to make it work reliably if you're dealing with large files like we are in this example. The exact numbers don't seem to be documented.
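To get a feeling for what that rule of thumb means in practice, here's a small sketch. It only encodes the anecdotal 10 percent limit mentioned above, which is an assumption rather than documented behavior, so the fraction is a parameter. `likelyFitsInCache` is a hypothetical helper; the file size is the Content-Length from the curl output.

```swift
import Foundation

/// Rough check whether a response of `responseSize` bytes would be accepted by
/// a cache with `diskCapacity` bytes, assuming the cache rejects anything
/// larger than `maxFraction` of its capacity (anecdotal, not documented).
func likelyFitsInCache(responseSize: Int,
                       diskCapacity: Int,
                       maxFraction: Double = 0.1) -> Bool {
    guard responseSize >= 0, diskCapacity > 0 else { return false }
    return Double(responseSize) <= Double(diskCapacity) * maxFraction
}

let flightPlanSize = 20_702_285 // Content-Length of the example PDF

// Default disk cache measured above (~19 MB): far too small for a 20 MB file.
print(likelyFitsInCache(responseSize: flightPlanSize, diskCapacity: 20_000_000))    // false
// Custom 1 GB cache: 10 percent is 100 MB, so the file fits comfortably.
print(likelyFitsInCache(responseSize: flightPlanSize, diskCapacity: 1_000_000_000)) // true
```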
Next, we can control which cache should be used on a per-session basis. Instead of using URLSession.shared, we use a custom session object:
    let config = URLSessionConfiguration.default
    config.urlCache = cache
    let session = URLSession(configuration: config)
That's all that's needed! Your magical new disk cache is ready to go.
Accessing the Cache Offline
Using the cache can be controlled via the cachePolicy setting on URLRequest. The default is .useProtocolCachePolicy, which usually does the right thing, including returning a cached copy if the content is new enough. Depending on your content, you might want to use .returnCacheDataElseLoad in the offline case instead:
let req = URLRequest(url: flightPlanURL, cachePolicy: .returnCacheDataElseLoad)
ℹ️ Note: Depending on the cache rules, this policy will still return files that have already been deleted remotely, as the cache won't always hit the network.
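One way to wire this up is to pick the policy when the request is created. The sketch below assumes the app already knows whether it's offline (for example, from its own reachability check) and passes that in; `makeRequest(for:isOffline:)` is a hypothetical helper, not part of Foundation.

```swift
import Foundation
#if canImport(FoundationNetworking)
import FoundationNetworking // Needed on Linux, where URLRequest lives in a separate module.
#endif

/// Builds a request that follows the server's caching rules while online and
/// falls back to any cached copy, regardless of age, while offline.
func makeRequest(for url: URL, isOffline: Bool) -> URLRequest {
    let policy: URLRequest.CachePolicy =
        isOffline ? .returnCacheDataElseLoad : .useProtocolCachePolicy
    return URLRequest(url: url, cachePolicy: policy)
}

let pdfURL = URL(string: "https://www.nasa.gov/specials/apollo50th/pdf/a11final-fltpln.pdf")!
let offlineRequest = makeRequest(for: pdfURL, isOffline: true)
let onlineRequest = makeRequest(for: pdfURL, isOffline: false)
```

Keeping the default policy for the online case means the system still revalidates with the server when the headers call for it.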
Conclusion
Using URLCache for large files is straightforward once one is aware of the gotchas around file downloading. Refer to this gist to see how all the snippets fit together to build reliable offline caching for downloaded files. We hope this helps some of you reuse Apple's caching instead of rolling your own.
Source: https://pspdfkit.com/blog/2020/downloading-large-files-with-urlsession/
Posted by: dinorahdinorahhaniblee0277851.blogspot.com