This is part three of a blog post series about Snowplow on AWS Fargate.
- Part 1: Snowplow on AWS Fargate
- Part 2: Snowplow on AWS Fargate - Task Role
- Part 4: Snowplow on AWS Fargate - IAM Permissions
This post will outline common problems (and their solutions) encountered when running the Snowplow Stream Enrich container on AWS Fargate.
Name does not resolve
This is a common problem where the ECS agent does not properly set up the /etc/hosts file, and it also appears on AWS Fargate.
Example stack trace:
02:58:45 Exception in thread "main" java.net.UnknownHostException: 3b76b684dc30: 3b76b684dc30: Name does not resolve
02:58:45 at java.net.InetAddress.getLocalHost(InetAddress.java:1505)
02:58:45 at com.snowplowanalytics.snowplow.enrich.stream.sources.KinesisSource.run(KinesisSource.scala:117)
02:58:45 at com.snowplowanalytics.snowplow.enrich.stream.KinesisEnrich$.main(KinesisEnrich.scala:81)
02:58:45 at com.snowplowanalytics.snowplow.enrich.stream.KinesisEnrich.main(KinesisEnrich.scala)
02:58:45 Caused by: java.net.UnknownHostException: 3b76b684dc30: Name does not resolve
02:58:45 at java.net.Inet4AddressImpl.lookupAllHostAddr(Native Method)
02:58:45 at java.net.InetAddress$2.lookupAllHostAddr(InetAddress.java:928)
02:58:45 at java.net.InetAddress.getAddressesFromNameService(InetAddress.java:1323)
02:58:45 at java.net.InetAddress.getLocalHost(InetAddress.java:1500)
After some investigation, this Stack Overflow post pointed me in the right direction, though it needed some slight modifications.
First, the ethernet interface in AWS Fargate is going to be eth0. Second, if you’re using the official Snowplow Docker images, they already define an ENTRYPOINT instruction, so you’ll need to override it: write your own entrypoint and call the original from it.
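The two points above can be sketched as a small wrapper entrypoint. This is a minimal sketch under stated assumptions, not the exact script used for this post: it assumes the image provides a POSIX shell plus the `ip`, `awk` and `grep` utilities, that the container user is allowed to write to /etc/hosts, and that `ORIGINAL_ENTRYPOINT` (a hypothetical variable introduced only for this example) holds the path of the entrypoint the image originally defined.

```shell
#!/bin/sh
# Sketch of a wrapper entrypoint: map the container hostname to the eth0
# address in /etc/hosts, then hand over to the image's original entrypoint.
set -eu

# Read `ip addr show` output on stdin and print the first IPv4 address.
first_ipv4() {
  awk '{
    for (i = 1; i < NF; i++) {
      if ($i == "inet") { split($(i + 1), parts, "/"); print parts[1]; exit }
    }
  }'
}

# Append "<ip> <hostname>" to a hosts file unless the hostname is already listed.
add_hosts_entry() {
  entry_ip="$1"
  entry_host="$2"
  hosts_file="$3"
  if grep -qwF -- "$entry_host" "$hosts_file" 2>/dev/null; then
    return 0
  fi
  printf '%s %s\n' "$entry_ip" "$entry_host" >> "$hosts_file"
}

main() {
  ip_addr="$(ip addr show dev eth0 | first_ipv4)"
  if [ -n "$ip_addr" ]; then
    add_hosts_entry "$ip_addr" "$(hostname)" /etc/hosts
  else
    echo "warning: no IPv4 address found on eth0" >&2
  fi
  # Replace this shell with the original entrypoint, passing arguments through.
  exec "$ORIGINAL_ENTRYPOINT" "$@"
}

# ORIGINAL_ENTRYPOINT is a hypothetical variable for this sketch; nothing runs
# unless it is set, which also lets the functions be sourced and tested.
if [ -n "${ORIGINAL_ENTRYPOINT:-}" ]; then
  main "$@"
fi
```

In a derived image you would copy a script like this in, point ENTRYPOINT at it, and set `ORIGINAL_ENTRYPOINT` to whatever the base image's entrypoint actually is. Using `exec` keeps the original process as PID 1's replacement, so signals from ECS still reach it.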
Bucket not found
This is a problem related to the “enrichment” configuration files and the defaults defined here.
The problem can crop up in a number of ways (based on your config) but may look something like these:
NonEmptyList(The bucket is in this region: eu-west-1. Please use this region to retry the request (Service: Amazon S3; Status Code: 301

NonEmptyList(Access Denied (Service: Amazon S3; Status Code: 403; Error Code: AccessDenied; Request ID: xxx; S3 Extended Request ID: sjCFZle+xxx/xxx=), Access Denied (Service: Amazon S3; Status Code: 403; Error Code: AccessDenied; Request ID: xxx; S3 Extended Request ID: xxx/xxx/xxx/xxx=))

02:18:58 Exception in thread "Thread-6" java.net.UnknownHostException: snowplow-hosted-assets-us-west-2
02:18:58 at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:184)
02:18:58 at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
02:18:58 at java.net.Socket.connect(Socket.java:589)
02:18:58 at java.net.Socket.connect(Socket.java:538)
The UA Parser and GeoLite IP Lookups enrichments can both be configured to pull resources down from S3 buckets, but they use slightly different syntax. The ip_lookups.json file uses an http:// URI scheme pointing to a file in an S3 bucket, while the ua_parser_config.json file uses an s3:// URI scheme. I switched both of mine to http:// URI schemes to work around the problem, e.g.
"uri": "http://snowplow-hosted-assets.s3.amazonaws.com/third-party/maxmind"
"uri": "http://snowplow-hosted-assets.s3.amazonaws.com/third-party/ua-parser"
Note that Snowplow will eventually stop self-hosting the GeoLite databases, so you will need to host them yourself. See the original deprecation notice for more details.
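If you have several enrichment files to change, the switch to http:// URIs is a mechanical rewrite. The following is a rough sketch, assuming the bucket is reachable through the same `<bucket>.s3.amazonaws.com` host form used in the URIs above and that the objects are readable over plain HTTP. The function name `s3_to_http` and the file path in the usage comment are hypothetical.

```shell
#!/bin/sh
# Rewrite every "s3://<bucket>/<key>" read on stdin to
# "http://<bucket>.s3.amazonaws.com/<key>" on stdout.
# Lines that already use http:// pass through unchanged.
s3_to_http() {
  sed -E 's#s3://([^/"]+)/#http://\1.s3.amazonaws.com/#g'
}

# Usage (hypothetical path):
#   s3_to_http < enrichments/ua_parser_config.json > ua_parser_config.http.json
```

Writing to a new file instead of editing in place makes it easy to diff the result before baking it into your image.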
Check out the next post in this series, Snowplow on AWS Fargate - IAM Permissions, which covers IAM permissions required to run all Snowplow processes on AWS Fargate.