У меня есть корзина на AWS s3, в которой все объекты должны быть зашифрованы с помощью KMS. Я использую Presto на emr-5.2.1
У меня есть внешняя таблица на s3 (нет данных). Когда я использую
INSERT INTO hive.s3.new_table
SELECT * FROM src_table
Я получаю ошибку AccessDenied. Я протестировал несколько разных вариантов и обратился в службу поддержки, но безуспешно. Если я удалю политику из корзины, Presto будет работать нормально, но файлы, созданные на s3, не зашифрованы.
У Presto нет проблем с чтением зашифрованных внешних таблиц s3 или их локальным созданием на hdfs. Я не могу разрешить использование незашифрованных данных.
Пример политики:
{
"Version":"2012-10-17",
"Id":"PutObjPolicy",
"Statement":[{
"Sid":"DenyUnEncryptedObjectUploads",
"Effect":"Deny",
"Principal":"*",
"Action":"s3:PutObject",
"Resource":"arn:aws:s3:::YourBucket/*",
"Condition":{
"StringNotEquals":{
"s3:x-amz-server-side-encryption":"aws:kms"
}
}
}
]
}
http://docs.aws.amazon.com/AmazonS3/latest/dev/UsingKMSEncryption.html
Конфигурация Presto /etc/presto/conf/catalog/hive.properties
hive.s3.ssl.enabled=true
hive.s3.use-instance-credentials=true
hive.s3.sse.enabled = true
hive.s3.kms-key-id = long_key_id_here
...
Error:
com.facebook.presto.spi.PrestoException: Error committing write to Hive
at com.facebook.presto.hive.HiveRecordWriter.commit(HiveRecordWriter.java:132)
at com.facebook.presto.hive.HiveWriter.commit(HiveWriter.java:49)
at com.facebook.presto.hive.HivePageSink.doFinish(HivePageSink.java:152)
at com.facebook.presto.hive.authentication.NoHdfsAuthentication.doAs(NoHdfsAuthentication.java:23)
at com.facebook.presto.hive.HdfsEnvironment.doAs(HdfsEnvironment.java:76)
at com.facebook.presto.hive.HivePageSink.finish(HivePageSink.java:144)
at com.facebook.presto.spi.classloader.ClassLoaderSafeConnectorPageSink.finish(ClassLoaderSafeConnectorPageSink.java:49)
at com.facebook.presto.operator.TableWriterOperator.finish(TableWriterOperator.java:156)
at com.facebook.presto.operator.Driver.processInternal(Driver.java:394)
at com.facebook.presto.operator.Driver.processFor(Driver.java:301)
at com.facebook.presto.execution.SqlTaskExecution$DriverSplitRunner.processFor(SqlTaskExecution.java:622)
at com.facebook.presto.execution.TaskExecutor$PrioritizedSplitRunner.process(TaskExecutor.java:534)
at com.facebook.presto.execution.TaskExecutor$Runner.run(TaskExecutor.java:670)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.io.IOException: com.amazonaws.services.s3.model.AmazonS3Exception: Access Denied (Service: Amazon S3; Status Code: 403; Error Code: AccessDenied; Request ID: xxxxxx), S3 Extended Request ID: xxxxxxxxxxxxxx+xxx=
at com.facebook.presto.hive.PrestoS3FileSystem$PrestoS3OutputStream.uploadObject(PrestoS3FileSystem.java:1003)
at com.facebook.presto.hive.PrestoS3FileSystem$PrestoS3OutputStream.close(PrestoS3FileSystem.java:967)
at org.apache.hadoop.fs.FSDataOutputStream$PositionCache.close(FSDataOutputStream.java:74)
at org.apache.hadoop.fs.FSDataOutputStream.close(FSDataOutputStream.java:108)
at org.apache.hadoop.hive.ql.io.orc.WriterImpl.close(WriterImpl.java:2429)
at org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat$OrcRecordWriter.close(OrcOutputFormat.java:106)
at com.facebook.presto.hive.HiveRecordWriter.commit(HiveRecordWriter.java:129)
... 15 more
Caused by: com.amazonaws.services.s3.model.AmazonS3Exception: Access Denied (Service: Amazon S3; Status Code: 403; Error Code: AccessDenied; Request ID: xxxxxxx)
at com.amazonaws.http.AmazonHttpClient.handleErrorResponse(AmazonHttpClient.java:1387)
at com.amazonaws.http.AmazonHttpClient.executeOneRequest(AmazonHttpClient.java:940)
at com.amazonaws.http.AmazonHttpClient.executeHelper(AmazonHttpClient.java:715)
at com.amazonaws.http.AmazonHttpClient.doExecute(AmazonHttpClient.java:466)
at com.amazonaws.http.AmazonHttpClient.executeWithTimer(AmazonHttpClient.java:427)
at com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:376)
at com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:4039)
at com.amazonaws.services.s3.AmazonS3Client.putObject(AmazonS3Client.java:1583)
at com.amazonaws.services.s3.AmazonS3EncryptionClient.access$101(AmazonS3EncryptionClient.java:80)
at com.amazonaws.services.s3.AmazonS3EncryptionClient$S3DirectImpl.putObject(AmazonS3EncryptionClient.java:603)
at com.amazonaws.services.s3.internal.crypto.S3CryptoModuleBase.putObjectUsingMetadata(S3CryptoModuleBase.java:175)
at com.amazonaws.services.s3.internal.crypto.S3CryptoModuleBase.putObjectSecurely(S3CryptoModuleBase.java:161)
at com.amazonaws.services.s3.internal.crypto.CryptoModuleDispatcher.putObjectSecurely(CryptoModuleDispatcher.java:108)
at com.amazonaws.services.s3.AmazonS3EncryptionClient.putObject(AmazonS3EncryptionClient.java:483)
at com.amazonaws.services.s3.transfer.internal.UploadCallable.uploadInOneChunk(UploadCallable.java:131)
at com.amazonaws.services.s3.transfer.internal.UploadCallable.call(UploadCallable.java:123)
at com.amazonaws.services.s3.transfer.internal.UploadMonitor.call(UploadMonitor.java:139)
at com.amazonaws.services.s3.transfer.internal.UploadMonitor.call(UploadMonitor.java:47)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
... 3 more
Мне что-то не хватает в конфигурации или Presto не использует KMS при вставке в таблицы?
Согласно amazon: «Все запросы GET и PUT для объекта, защищенного AWS KMS, завершатся ошибкой, если они не будут выполнены через SSL или с использованием SigV4».
Can I use SSE-KMS with Presto? Unfortunately, no. Presto currently supports either SSE-S3(AES256) or ClientSideEncryption(CSE-KMS) EMR does support SSE-KMS for all the applications which use EMRFSFilesystem like Hive,Spark,MR and so on. Unfortunately, Presto uses PrestoFileSystem. It is the reason, any changes/improvements needs to be added directly to Presto.
- person KonradK   schedule 14.02.2017