Plugin-498: Added content-type to GCS sink #462
CuriousVini merged 1 commit into data-integrations:develop
}

if (contentType == null) {
  contentType = DEFAULT_CONTENT_TYPE;
Don't modify the field in the validate method. Instead, introduce a getter method and do the defaulting logic there. In the validate method, call the getter for validation if needed.
Please check the updated PR.
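A minimal sketch of the getter-based defaulting the reviewer asks for. It is an illustration, not the plugin's actual code: the class name `SinkConfigSketch` and the value assigned to `DEFAULT_CONTENT_TYPE` are assumptions made for the example.

```java
// Hypothetical config class; the class name and the default value are assumptions.
final class SinkConfigSketch {
  static final String DEFAULT_CONTENT_TYPE = "application/octet-stream"; // assumed default

  private final String contentType; // null when the user leaves the property unset

  SinkConfigSketch(String contentType) {
    this.contentType = contentType;
  }

  // Defaulting lives in the getter, so the field itself is never mutated.
  String getContentType() {
    return contentType == null ? DEFAULT_CONTENT_TYPE : contentType;
  }

  // validate() reads through the getter and leaves the object's state untouched.
  void validate() {
    if (getContentType().trim().isEmpty()) {
      throw new IllegalArgumentException("Content type must not be blank.");
    }
  }
}
```

With this shape, calling `validate()` any number of times has no side effects, and every caller of `getContentType()` sees the same defaulted value.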
private static final String RECORDS_UPDATED_METRIC = "records.updated";
public static final String AVRO_NAMED_OUTPUT = "avro.mo.config.namedOutput";
public static final String COMMON_NAMED_OUTPUT = "mapreduce.output.basename";
public static final String CONTENT_TYPE = "Content-Type";
Prefix the FS configuration key, and use "." notation, e.g. "io.cdap.gcs.batch.sink.content.type".
This was changed to public static final String CONTENT_TYPE = "io.cdap.gcs.batch.sink.content.type"; in the updated PR.
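As a rough illustration of why the prefixed key helps, the sketch below stores the content type under the namespaced key. A plain `Map` stands in for the job configuration, and the class `ConfigKeySketch` and method `outputProperties` are hypothetical names introduced only for this example.

```java
import java.util.HashMap;
import java.util.Map;

// A plain Map stands in for the job configuration here (an assumption for the sketch).
final class ConfigKeySketch {
  // Prefixed, dot-separated key, as suggested in the review.
  static final String CONTENT_TYPE = "io.cdap.gcs.batch.sink.content.type";

  // Hypothetical helper: builds the properties a sink would hand to its output format.
  static Map<String, String> outputProperties(String contentType) {
    Map<String, String> props = new HashMap<>();
    props.put(CONTENT_TYPE, contentType);
    return props;
  }
}
```

A bare key such as "Content-Type" is easy to confuse with unrelated settings in a shared configuration, while the prefixed form makes the owner of the setting obvious.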
@flakrimjusufi If multiple GCS sinks are used in a single pipeline with different content types, what is the behavior? Are there any conflicts for content type?

/gcbrun

We ran some tests with multiple GCS sinks with different content types, and the behavior was as expected. There were no conflicts for content type, and the pipelines ran successfully.
public String getContentType() {
  if (!Strings.isNullOrEmpty(contentType)) {
    if (contentType.equals(CONTENT_TYPE_OTHER)) {
      if (Strings.isNullOrEmpty(customContentType)) {
Does this mean that if customContentType is a macro, we use the default content type?
Same here: if the value of customContentType is a macro, the value specified by the user is used.
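The exchange above can be read against a sketch like the following. It is an assumption-laden illustration: the class `ContentTypeResolverSketch`, the method `resolve`, and the values of `CONTENT_TYPE_OTHER` and `DEFAULT_CONTENT_TYPE` are invented for the example, and the fallback branch is a guess at what the truncated diff does.

```java
// Hypothetical resolver; constant values and the fallback behavior are assumptions.
final class ContentTypeResolverSketch {
  static final String CONTENT_TYPE_OTHER = "other";                      // assumed value
  static final String DEFAULT_CONTENT_TYPE = "application/octet-stream"; // assumed value

  static String resolve(String contentType, String customContentType) {
    // Nothing selected: fall back to the default.
    if (contentType == null || contentType.isEmpty()) {
      return DEFAULT_CONTENT_TYPE;
    }
    // A concrete type was selected: use it as-is.
    if (!contentType.equals(CONTENT_TYPE_OTHER)) {
      return contentType;
    }
    // "other" was selected: prefer the custom value when one is available.
    boolean hasCustom = customContentType != null && !customContentType.isEmpty();
    return hasCustom ? customContentType : DEFAULT_CONTENT_TYPE;
  }
}
```

Under this reading, an unevaluated macro shows up as a missing custom value only at configure time; once the macro has been evaluated, `resolve` returns what the user supplied.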
src/main/java/io/cdap/plugin/gcp/gcs/sink/GCSOutputFormatProvider.java (outdated, resolved)
Integration tests: cdapio/cdap-integration-tests#1083
src/main/java/io/cdap/plugin/gcp/gcs/sink/RecordFilterOutputFormat.java (outdated, resolved)
  }
}

if (containsMacro(NAME_CONTENT_TYPE)) {
No need for this check. By default, if it's a macro or not provided at configure time, it will be null.
failureCollector.addFailure(String.format(
  "Valid content types for json are %s, %s", CONTENT_TYPE_APPLICATION_JSON,
  CONTENT_TYPE_TEXT_PLAIN),
  null
nit: move the remaining argument up to the line above.
src/main/java/io/cdap/plugin/gcp/gcs/sink/RecordFilterOutputFormat.java (3 outdated threads, resolved)
switch (format) {
  case FORMAT_AVRO:
    if (!contentType.equalsIgnoreCase(CONTENT_TYPE_APPLICATION_AVRO)) {
      failureCollector.addFailure(String.format("Valid content type for avro is %s",
Please add a period to the end of all the validation error messages.
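A rough sketch of the format-specific validation with the review feedback applied: messages end with a period, and a null content type (a macro, or not provided at configure time) is skipped without a separate macro check. The class `ContentTypeValidationSketch`, the `List` standing in for the failure collector, and all constant values are assumptions for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical validator; constant values are assumed, and a List replaces the failure collector.
final class ContentTypeValidationSketch {
  static final String FORMAT_AVRO = "avro";
  static final String FORMAT_JSON = "json";
  static final String CONTENT_TYPE_APPLICATION_AVRO = "application/avro";
  static final String CONTENT_TYPE_APPLICATION_JSON = "application/json";
  static final String CONTENT_TYPE_TEXT_PLAIN = "text/plain";

  static List<String> validate(String format, String contentType) {
    List<String> failures = new ArrayList<>();
    if (contentType == null) {
      return failures; // macro or unset at configure time: nothing to validate yet
    }
    switch (format) {
      case FORMAT_AVRO:
        if (!contentType.equalsIgnoreCase(CONTENT_TYPE_APPLICATION_AVRO)) {
          failures.add(String.format("Valid content type for avro is %s.",
                                     CONTENT_TYPE_APPLICATION_AVRO));
        }
        break;
      case FORMAT_JSON:
        if (!contentType.equalsIgnoreCase(CONTENT_TYPE_APPLICATION_JSON)
            && !contentType.equalsIgnoreCase(CONTENT_TYPE_TEXT_PLAIN)) {
          failures.add(String.format("Valid content types for json are %s, %s.",
                                     CONTENT_TYPE_APPLICATION_JSON, CONTENT_TYPE_TEXT_PLAIN));
        }
        break;
      default:
        break; // other formats are not covered by this sketch
    }
    return failures;
  }
}
```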
src/main/java/io/cdap/plugin/gcp/gcs/sink/GCSOutputCommitter.java (outdated, resolved)
CuriousVini left a comment:
A few comments. Please squash the commits after fixing.
Also, please file a JIRA for the GCS Multi sink plugin.
src/test/java/io/cdap/plugin/gcp/gcs/sink/GCSBatchSinkTest.java (outdated, resolved)
}

@Nullable
@Description("Gets the value of content type")
This should be a Javadoc, not an annotation. The Javadoc should include the mapping of format -> content type.
39041a1 to 484b73f
The PR is updated according to the latest review, and the commits are squashed. Here is the JIRA ticket for the GCS Multi Sink plugin:
Add content-type to GCS sink
JIRA Ticket: https://cdap.atlassian.net/browse/PLUGIN-498