Sergii Mikhtoniuk
01/06/2023, 12:36 AM
ParquetRowInputFormat is gone... The ParquetColumnarRowInputFormat gives me DataStream[RowData], which I struggle a lot with:
- I cannot even stream.print() it, as toString is not defined for RowData
- I cannot plug it into the Table API, as it results in a single column f0 RAW(...), even though I properly define the types and names of each field when constructing ParquetColumnarRowInputFormat
I've seen suggestions to read Parquet via the Table API, but in my case I need more low-level control over which files are read and in which order, to then combine them into a single stream.
Qs:
- Is there a correct way to create a table from DataStream[RowData]?
- If not, can I convert DataStream[RowData] into DataStream[Row]
(I don't care too much about efficiency atm)?

Nitin Agrawal
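On the two questions above: as far as I can tell, Flink 1.15 ships org.apache.flink.table.data.conversion.RowRowConverter (created from the row's DataType) for exactly this RowData-to-Row conversion inside a MapFunction, and giving fromDataStream an explicit schema is what avoids the single f0 RAW column — treat both as pointers to verify, not a guaranteed recipe. A stdlib-only sketch of the conversion shape, with hypothetical stand-in types in place of the Flink classes:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Function;

// Stand-in sketch (NOT Flink classes): shows the map-based conversion pattern
// behind turning DataStream[RowData] into DataStream[Row]. In a real Flink 1.15
// job the same shape would use org.apache.flink.table.data.RowData,
// org.apache.flink.types.Row and RowRowConverter inside a MapFunction.
public class RowDataToRowSketch {

    // stand-in for Flink's internal, toString-less RowData
    static class FakeRowData {
        final int id; final String name;
        FakeRowData(int id, String name) { this.id = id; this.name = name; }
    }

    // stand-in for the external Row type, which IS printable
    static class FakeRow {
        final Object[] fields;
        FakeRow(Object... fields) { this.fields = fields; }
        @Override public String toString() { return Arrays.toString(fields); }
    }

    // the "MapFunction": internal record -> external printable Row
    static final Function<FakeRowData, FakeRow> CONVERT =
            rd -> new FakeRow(rd.id, rd.name);

    public static List<String> run() {
        List<FakeRowData> stream =
                Arrays.asList(new FakeRowData(1, "a"), new FakeRowData(2, "b"));
        List<String> printed = new ArrayList<>();
        for (FakeRowData rd : stream) {
            printed.add(CONVERT.apply(rd).toString()); // now print()-able
        }
        return printed;
    }

    public static void main(String[] args) {
        run().forEach(System.out::println); // [1, a] then [2, b]
    }
}
```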
01/06/2023, 3:07 AM
Hybrid Source implementation. Is it present? If not, what is the recommendation for solving the Hybrid Source use case for Table API connectors?

Suparn Lele
01/06/2023, 4:53 AM

Chen-Che Huang
01/06/2023, 10:24 AM
We use flink-k8s-operator 1.3 to deploy our Flink apps in a k8s cluster. Today, in one of our clusters, we encountered the following error. We expected this issue to be fixed by flink-k8s-operator 1.2 because of https://issues.apache.org/jira/browse/FLINK-28272. We tried deleting the webhook-server-cert secret and restarting the operator pod, but the issue still happens. I found someone who encountered the same issue about two months ago and am not sure whether more users are having it. Any comment is appreciated 🙏
one or more objects failed to apply, reason: Internal error occurred: failed calling webhook "validationwebhook.flink.apache.org": failed to call webhook: Post "https://flink-operator-webhook-service.flink-operator.svc:443/validate?timeout=10s": x509: certificate signed by unknown authority (possibly because of "x509: invalid signature: parent certificate cannot sign this kind of certificate" while trying to verify candidate authority certificate "FlinkDeployment Validator")
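A cert-reset sequence that has reportedly helped with this webhook error — a sketch only: the namespace, secret, webhook-configuration and deployment names below are assumptions taken from the error message and common Helm defaults, so verify each with kubectl get first:

```shell
# Find the operator's validating webhook configuration (exact name varies by install)
kubectl get validatingwebhookconfigurations

# Delete the stale cert AND the webhook configuration so its caBundle is re-injected,
# then restart the operator so it regenerates both. Names below are assumptions.
kubectl -n flink-operator delete secret webhook-server-cert
kubectl delete validatingwebhookconfiguration flink-operator-webhook-configuration
kubectl -n flink-operator rollout restart deployment/flink-kubernetes-operator
```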
Lucas Alcântara Freire
01/06/2023, 12:41 PM
input.filter(someFiltering())
    .keyBy(e -> e.getId())
    .map(new Tracing<>("ResourceName")) // starts the datadog span
    .flatMap(someWork())
    .process(someMoreWork())
    .addSink(saveInDB())
    .name("cool name");
Tracing map method:
@Override
public IN map(IN in) throws Exception {
    final Span span = GlobalTracer.get().buildSpan("job.processing")
            .withTag(DDTags.RESOURCE_NAME, this.resourceName)
            .start();
    try (final Scope scope = GlobalTracer.get().activateSpan(span)) {
        return in;
    } finally {
        span.finish();
    }
}
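A caveat worth noting about the snippet above, illustrated with a stdlib-only stand-in (not the Datadog API): because the span starts and finishes inside map(), it only measures the identity return — the downstream flatMap/process/sink operators run after map() has returned, outside the span:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

// Minimal stand-in for a tracer span (NOT the Datadog API) illustrating why a
// span opened and finished inside map() cannot cover downstream operator work.
public class SpanScopeDemo {
    static List<String> events = new ArrayList<>();

    static Function<Integer, Integer> tracingMap = in -> {
        events.add("span-open");
        try {
            return in;                 // same shape as the Tracing.map() above
        } finally {
            events.add("span-close");  // closes before downstream operators run
        }
    };

    static Function<Integer, Integer> someWork = in -> {
        events.add("work");            // the "flatMap" work the span was meant to cover
        return in * 2;
    };

    public static List<String> run() {
        events.clear();
        someWork.apply(tracingMap.apply(21));
        return events;
    }

    public static void main(String[] args) {
        System.out.println(run()); // [span-open, span-close, work]
    }
}
```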
sumit gulati
01/06/2023, 12:43 PM
Caused by: org.postgresql.util.PSQLException: The server requested SCRAM-based authentication, but no password was provided.
I'm sinking the stream to PostgreSQL using the JDBC sink connector. The connector is able to take the password from the Config.properties file and works fine locally (pushing data to Postgres), but when deployed it is unable to initialise the sink and causes this issue.

Mehul Batra
01/06/2023, 4:39 PM

Nick Caballero
01/06/2023, 5:27 PM

Fariz Hajiyev
01/06/2023, 6:37 PM

Slackbot
01/06/2023, 11:42 PM

Xi Cheng
01/07/2023, 5:26 AM
When using RoundRobinAssignor instead of the default RangeAssignor, some task managers don't get any partitions assigned while some task managers get a lot (5) of partitions. We are not sure what led to this imbalance in the task managers' Kafka partition assignment.

Andy Chambers
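Worth checking on the question above (my understanding, not verified against your version): Flink's Kafka source assigns partitions to source subtasks itself with a deterministic scheme, so the consumer property partition.assignment.strategy (RoundRobinAssignor vs RangeAssignor) may simply not take effect. A toy model of how a deterministic modulo assignment can leave some subtasks empty — the formula here is illustrative, not Flink's actual one:

```java
import java.util.Map;
import java.util.TreeMap;

// Illustrative only: a deterministic (hash + partition) % parallelism scheme
// can skew -- two topics whose hashes collide modulo the parallelism stack
// their partitions on the same subtasks, leaving others with nothing.
public class ModuloAssignmentSkew {
    static Map<Integer, Integer> assign(int parallelism, int[] topicHashes, int partitionsPerTopic) {
        Map<Integer, Integer> perSubtask = new TreeMap<>();
        for (int h : topicHashes) {
            for (int p = 0; p < partitionsPerTopic; p++) {
                int subtask = Math.floorMod(h + p, parallelism);
                perSubtask.merge(subtask, 1, Integer::sum);
            }
        }
        return perSubtask;
    }

    public static void main(String[] args) {
        // parallelism 8, two topics with hashes 0 and 8 (equal mod 8), 4 partitions each:
        // subtasks 0-3 get 2 partitions apiece, subtasks 4-7 get none
        System.out.println(assign(8, new int[]{0, 8}, 4)); // prints {0=2, 1=2, 2=2, 3=2}
    }
}
```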
01/07/2023, 5:24 PM

Rafael Jeon
01/08/2023, 11:32 AM

Aviv Dozorets
01/08/2023, 2:01 PM
How can I get a hook for a job going from RUNNING to anything else, especially if it's failing? So far I couldn't get onJobExecuted to trigger when the job is cancelled or failed.

Eric Liu
01/09/2023, 6:45 AM
I increased blob.fetch.num-concurrent from 50 to 200 but the error still occurs:
2023-01-09 06:26:28,780 ERROR org.apache.flink.runtime.blob.BlobServerConnection [] - Error while executing BLOB connection.
java.io.IOException: Unknown operation 71
	at org.apache.flink.runtime.blob.BlobServerConnection.run(BlobServerConnection.java:116) [flink-dist-1.15.3.jar:1.15.3]
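One commonly reported cause of this error (an observation, not a confirmed diagnosis for your cluster): "Unknown operation 71" means the first byte the BLOB server read was 71, and 71 is the ASCII code for 'G' — the first byte of an HTTP "GET " request. That usually points to an HTTP health check or port scanner probing the BLOB port, which would also explain why raising blob.fetch.num-concurrent didn't help:

```java
// The BLOB protocol reads a single opcode byte per connection; 71 is not a
// Flink opcode, but it is ASCII 'G' -- what an HTTP probe sends first.
public class UnknownOperation71 {
    public static int firstByteOf(String request) {
        return request.charAt(0); // the byte BlobServerConnection reads as the opcode
    }

    public static void main(String[] args) {
        System.out.println(firstByteOf("GET /health HTTP/1.1")); // prints 71
    }
}
```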
Tiansu Yu
01/09/2023, 10:41 AM
Is there a RecordWiseFileCompactor for Parquet files? (Flink 1.15)
I find that the documentation on compaction (https://nightlies.apache.org/flink/flink-docs-release-1.15/docs/connectors/datastream/filesystem/#compaction) lacks an explanation / example of how to create a custom CompactingFileReader.
There is one example I found (https://stackoverflow.com/questions/72716885/use-of-compaction-for-parquet-bulk-format), but the solution looks a bit verbose: it seems you have to implement two interfaces in order to let the RecordWiseFileCompactor read a Parquet file. I wonder if there is a simpler way to achieve this, e.g.:
1. Is there a direct wrapper of FileInputFormat on top of AvroParquetReader lying somewhere in Flink? (I remember in 1.13 there was something like ParquetAvroInputFormat for generic records.)
2. One should be able to supply a FileInputFormat to InputFormatBasedReader.Factory without wrapping it inside SerializableSupplierWithException. Is this possible?

Luis Calado
01/09/2023, 11:55 AMCould not find any factory for identifier 'iceberg' that implements 'org.apache.flink.table.factories.DynamicTableFactory'
I've added the dependencies iceberg-flink-runtime
, iceberg-flink
and flink-sql-connector-hive
.
Using version 1.14.2Tawfik Yasser
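Two things worth checking for the factory error above (hedged suggestions, not a confirmed diagnosis): the iceberg-flink-runtime jar must be built for your exact Flink minor version, and it must be visible to the classloader doing factory discovery (e.g. in Flink's lib/ directory, not only bundled in the user jar). For reference, a sketch of a catalog DDL the Iceberg connector expects — the catalog name, metastore URI and warehouse path are placeholders:

```sql
-- Sketch only: placeholders for URI/warehouse; requires an iceberg-flink-runtime
-- jar matching your Flink minor version on the classpath.
CREATE CATALOG iceberg_catalog WITH (
  'type' = 'iceberg',
  'catalog-type' = 'hive',
  'uri' = 'thrift://metastore-host:9083',
  'warehouse' = 'hdfs://namenode:8020/warehouse'
);
```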
01/09/2023, 4:06 PM

mgu
01/09/2023, 4:11 PM

Gaurav Miglani
01/09/2023, 5:50 PM
Files are being written with the .uncompacted- prefix. I checked the Flink code of the compact operator (https://github.com/apache/flink/blob/master/flink-connectors/flink-connector-files[…]/flink/connector/file/table/stream/compact/CompactOperator.java): at checkpoint time, files are committed to S3 as .uncompacted-part-uuid. I'm trying to create an Athena table on top of them, but it seems Athena ignores files starting with a dot. Is there any way I can resolve this, e.g. take the prefix from config? Also, after enabling compaction, some file names in S3 are still .uncompacted-part-uuid. Is this the default behaviour? 🤔

Ashutosh Joshi
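For reference on the compaction question above, a sketch of a filesystem sink with compaction enabled (Flink SQL; path and sizes are placeholders). As far as I can tell from CompactOperator, the .uncompacted- prefix is hard-coded rather than configurable, and the dot prefix is what keeps half-written files hidden from readers like Athena until a later checkpoint compacts them:

```sql
-- Sketch: filesystem sink with auto-compaction (placeholders for path/sizes)
CREATE TABLE s3_sink (
  id BIGINT,
  payload STRING
) WITH (
  'connector' = 'filesystem',
  'path' = 's3://my-bucket/output',   -- placeholder
  'format' = 'parquet',
  'auto-compaction' = 'true',
  'compaction.file-size' = '128MB'    -- target size of compacted files
);
```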
01/09/2023, 7:19 PM

Maryam
01/09/2023, 7:31 PM
Is there a Table Aggregate Function UDF that emits a value on a window, or after a specified number of input elements have arrived? Is this possible in the Flink Table API?

Adriana Beltran
01/10/2023, 12:47 AM
make_protobuf_type as a function that creates a Statefun type that is backed by Proto.

Stephan Helas
01/10/2023, 9:26 AM

V N Rahul Bharadwaj Tumpala
01/10/2023, 12:39 PM

Amol Khare
01/10/2023, 3:05 PM

Tiansu Yu
01/10/2023, 4:40 PM
I use AvroParquetReader, which in turn uses HadoopInputFile, inside my Flink application. It runs on a local Flink 1.15 cluster that has flink-s3-fs-hadoop added to the plugins folder. This worked normally until I added a file compactor, where I need to use AvroParquetReader to read Parquet files. Now it complains that java.lang.ClassNotFoundException: Class org.apache.hadoop.fs.s3a.S3AFileSystem not found. For Hadoop, I use a locally installed hadoop-2.10 and added its classpath alongside the Hadoop plugin jar file to HADOOP_CLASSPATH, so I really don't see why Hadoop could not find an S3AFileSystem to use.

Ruslan Danilin
01/10/2023, 9:40 PM

Jeremy Ber
01/10/2023, 9:45 PM
.watermark(column, expression)?
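Assuming the question above is about the Table API Schema builder's watermark(column, expression) method, the DDL below is a sketch of the equivalent SQL form — table name, columns and connector are placeholders:

```sql
-- Equivalent DDL form of Schema.newBuilder().watermark("ts", "ts - INTERVAL '5' SECOND"):
CREATE TABLE events (
  id BIGINT,
  ts TIMESTAMP(3),
  WATERMARK FOR ts AS ts - INTERVAL '5' SECOND  -- bounded out-of-orderness of 5s
) WITH (
  'connector' = 'datagen'  -- placeholder connector
);
```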
sharad mishra
01/10/2023, 9:51 PM