Gerrit van Doorn
10/24/2022, 10:22 PMMayank
Gerrit van Doorn
10/24/2022, 11:59 PMSending request: <http://foo.bar.company.com:9000/v2/segments?tableName=events> to controller: <http://foo.bar.company.com|foo.bar.company.com>, version: Unknown
Response for pushing table events segment events_OFFLINE_1606134538000000000_1665361807000000000_1 to location <http://foo.bar.company.com:9000> - 200: {"status":"Successfully uploaded segment: events_OFFLINE_1606134538000000000_1665361807000000000_1 of table: events_OFFLINE"}
8 times (1 for each file)
Pressing enter doesn’t do anything, I have to use CTRL+C to have it exitGerrit van Doorn
10/25/2022, 12:00 AMMayank
Gerrit van Doorn
10/25/2022, 12:00 AMHaitao Zhang
10/25/2022, 1:08 AMHaitao Zhang
10/25/2022, 1:09 AMGerrit van Doorn
10/25/2022, 2:19 AMincludeFileNamePattern: 'glob:**/partition-000*.recordio'
3. I did not enable debug. What is the best way to capture all? I’m currently using -Dlog4j2.configurationFile=conf/log4j2.xml
with the defaultsMayank
Gerrit van Doorn
10/25/2022, 2:20 AMGerrit van Doorn
10/25/2022, 2:59 AMHaitao Zhang
10/25/2022, 3:00 AMGerrit van Doorn
10/25/2022, 3:01 AMHaitao Zhang
10/25/2022, 3:15 AMHaitao Zhang
10/25/2022, 3:16 AMis it possible that you share the ingestionJobSpec?I can try your spec if you can share
Gerrit van Doorn
10/25/2022, 3:18 AMexecutionFrameworkSpec:
name: 'standalone'
segmentGenerationJobRunnerClassName: 'org.apache.pinot.plugin.ingestion.batch.standalone.SegmentGenerationJobRunner'
segmentTarPushJobRunnerClassName: 'org.apache.pinot.plugin.ingestion.batch.standalone.SegmentTarPushJobRunner'
segmentUriPushJobRunnerClassName: 'org.apache.pinot.plugin.ingestion.batch.standalone.SegmentUriPushJobRunner'
segmentMetadataPushJobRunnerClassName: 'org.apache.pinot.plugin.ingestion.batch.standalone.SegmentMetadataPushJobRunner'
# Recommended to set jobType to SegmentCreationAndMetadataPush for production environment where Pinot Deep Store is configured
#jobType: SegmentCreation
#jobType: SegmentTarPush
#jobType: SegmentUriPush
# jobType: SegmentMetadataPush
#jobType: SegmentCreationAndTarPush
#jobType: SegmentCreationAndUriPush
jobType: SegmentCreationAndMetadataPush
# NOTE: use "-" instead of "_"
inputDirURI: '<os://FEED-STAGE/day=2022-10-10/hour=00>'
includeFileNamePattern: 'glob:**/partition-000*.recordio'
outputDirURI: '<s3://pinot-deep-store/segments/offline>'
overwriteOutput: true
pinotFSSpecs:
- scheme: os
className: com.company.pinot.plugin.filesystem.object_store.ObjectStorePinotFS
- scheme: s3
className: org.apache.pinot.plugin.filesystem.S3PinotFS
configs:
region: us-west-2
- scheme: file
className: org.apache.pinot.spi.filesystem.LocalPinotFS
recordReaderSpec:
dataFormat: 'recordio'
className: 'com.company.pinot.plugin.inputformat.recordio.RecordIORecordReader'
tableSpec:
tableName: 'events'
pinotClusterSpecs:
- controllerURI: '<http://pinot-controller.company.com:9000>'
pushJobSpec:
pushAttempts: 2
pushRetryIntervalMillis: 1000
Gerrit van Doorn
10/25/2022, 3:25 AMSegmentCreationAndMetadataPush
first upload the data to deepstore, then download it again to fetch metadata? That seems inefficient considering it just was processing that same dataHaitao Zhang
10/25/2022, 3:31 AMbtw does theRight, that's something we can improvefirst upload the data to deepstore, then download it again to fetch metadata? That seems inefficient considering it just was processing that same dataSegmentCreationAndMetadataPush
Haitao Zhang
10/25/2022, 3:34 AMStart pushing segment metadata:
?Haitao Zhang
10/25/2022, 3:36 AMsegmentMetadataFile
Haitao Zhang
10/25/2022, 3:38 AMGerrit van Doorn
10/25/2022, 3:40 AMHaitao Zhang
10/25/2022, 3:41 AMGerrit van Doorn
10/25/2022, 10:09 PM2022-10-25 21:35:57 DEBUG amazon.awssdk.utils.Logger:84 - Received successful response: 200
2022-10-25 21:35:57 DEBUG amazon.awssdk.utils.Logger:84 - Interceptor 'software.amazon.awssdk.services.s3.internal.handlers.EnableTrailingChecksumInterceptor@33e0c716' modified the message with its modifyResponse method.
2022-10-25 21:35:59 DEBUG apache.http.impl.conn.PoolingHttpClientConnectionManager:344 - Connection [id: 5][route: {tls}-><http://proxy-batch.foo.company.com:3128>-><https://project-pinot-deep-store.s3.us-west-2.amazonaws.com:443>] can be kept alive for 60.0 seconds
2022-10-25 21:35:59 DEBUG apache.http.impl.conn.LoggingManagedHttpClientConnection:88 - http-outgoing-5: set socket timeout to 0
2022-10-25 21:35:59 DEBUG apache.http.impl.conn.PoolingHttpClientConnectionManager:351 - Connection released: [id: 5][route: {tls}-><http://proxy-batch.foo.company.com:3128>-><https://project-pinot-deep-store.s3.us-west-2.amazonaws.com:443>][total available: 1; route allocated: 1 of 50; total allocated: 1 of 50]
2022-10-25 21:35:59 INFO apache.pinot.segment.local.utils.SegmentPushUtils:352 - Trying to untar Metadata file from: [/tmp/segmentTar-8535c540-b8a9-495a-96eb-64852b3646b7.tar.gz] to [/tmp/segmentMetadataDir-8535c540-b8a9-495a-96eb-64852b3646b7]
2022-10-25 21:35:59 INFO apache.pinot.segment.local.utils.SegmentPushUtils:357 - Trying to untar CreationMeta file from: [/tmp/segmentTar-8535c540-b8a9-495a-96eb-64852b3646b7.tar.gz] to [/tmp/segmentMetadataDir-8535c540-b8a9-495a-96eb-64852b3646b7]
2022-10-25 21:35:59 INFO apache.pinot.segment.local.utils.SegmentPushUtils:366 - Trying to tar segment metadata dir [/tmp/segmentMetadataDir-8535c540-b8a9-495a-96eb-64852b3646b7] to [/tmp/segmentMetadata-8535c540-b8a9-495a-96eb-64852b3646b7.tar.gz]
2022-10-25 21:35:59 INFO apache.pinot.segment.local.utils.SegmentPushUtils:252 - Pushing segment: events_OFFLINE_1606134538000000000_1665361807000000000_1 to location: <http://foo8d-rg1-36a.foo.company.com:9000> for table events
2022-10-25 21:35:59 DEBUG apache.pinot.shaded.org.apache.http.client.protocol.RequestAddCookies:123 - CookieSpec selected: default
2022-10-25 21:35:59 DEBUG apache.pinot.shaded.org.apache.http.client.protocol.RequestAuthCache:77 - Auth cache not set in the context
2022-10-25 21:35:59 DEBUG apache.pinot.shaded.org.apache.http.impl.conn.PoolingHttpClientConnectionManager:267 - Connection request: [route: {}-><http://foo8d-rg1-36a.foo.company.com:9000>][total available: 1; route allocated: 1 of 2; total allocated: 1 of 20]
2022-10-25 21:35:59 DEBUG apache.pinot.shaded.org.apache.http.impl.conn.PoolingHttpClientConnectionManager:312 - Connection leased: [id: 0][route: {}-><http://foo8d-rg1-36a.foo.company.com:9000>][total available: 0; route allocated: 1 of 2; total allocated: 1 of 20]
2022-10-25 21:35:59 DEBUG apache.pinot.shaded.org.apache.http.impl.conn.LoggingManagedHttpClientConnection:88 - http-outgoing-0: set socket timeout to 0
2022-10-25 21:35:59 DEBUG apache.pinot.shaded.org.apache.http.impl.conn.LoggingManagedHttpClientConnection:88 - http-outgoing-0: set socket timeout to 600000
2022-10-25 21:35:59 DEBUG apache.pinot.shaded.org.apache.http.impl.execchain.MainClientExec:255 - Executing request POST /v2/segments?tableName=events HTTP/1.1
2022-10-25 21:35:59 DEBUG apache.pinot.shaded.org.apache.http.impl.execchain.MainClientExec:260 - Target auth state: UNCHALLENGED
2022-10-25 21:35:59 DEBUG apache.pinot.shaded.org.apache.http.impl.execchain.MainClientExec:266 - Proxy auth state: UNCHALLENGED
2022-10-25 21:35:59 DEBUG apache.pinot.shaded.org.apache.http.impl.conn.LoggingManagedHttpClientConnection:133 - http-outgoing-0 >> POST /v2/segments?tableName=events HTTP/1.1
2022-10-25 21:35:59 DEBUG apache.pinot.shaded.org.apache.http.impl.conn.LoggingManagedHttpClientConnection:136 - http-outgoing-0 >> DOWNLOAD_URI: <s3://project-pinot-deep-store/segments/offline/events_OFFLINE_1606134538000000000_1665361807000000000_1.tar.gz>
2022-10-25 21:35:59 DEBUG apache.pinot.shaded.org.apache.http.impl.conn.LoggingManagedHttpClientConnection:136 - http-outgoing-0 >> UPLOAD_TYPE: METADATA
2022-10-25 21:35:59 DEBUG apache.pinot.shaded.org.apache.http.impl.conn.LoggingManagedHttpClientConnection:136 - http-outgoing-0 >> COPY_SEGMENT_TO_DEEP_STORE: false
2022-10-25 21:35:59 DEBUG apache.pinot.shaded.org.apache.http.impl.conn.LoggingManagedHttpClientConnection:136 - http-outgoing-0 >> Content-Length: 8785
2022-10-25 21:35:59 DEBUG apache.pinot.shaded.org.apache.http.impl.conn.LoggingManagedHttpClientConnection:136 - http-outgoing-0 >> Content-Type: multipart/form-data; boundary=IBKhI--_wIiCbIqMX5Vonk1I9tdXVGFgYcrP3
2022-10-25 21:35:59 DEBUG apache.pinot.shaded.org.apache.http.impl.conn.LoggingManagedHttpClientConnection:136 - http-outgoing-0 >> Host: <http://foo8d-rg1-36a.foo.company.com:9000|foo8d-rg1-36a.foo.company.com:9000>
2022-10-25 21:35:59 DEBUG apache.pinot.shaded.org.apache.http.impl.conn.LoggingManagedHttpClientConnection:136 - http-outgoing-0 >> Connection: Keep-Alive
2022-10-25 21:35:59 DEBUG apache.pinot.shaded.org.apache.http.impl.conn.LoggingManagedHttpClientConnection:136 - http-outgoing-0 >> User-Agent: Apache-HttpClient/4.5.13 (Java/1.8.0_342)
2022-10-25 21:35:59 DEBUG apache.pinot.shaded.org.apache.http.impl.conn.LoggingManagedHttpClientConnection:136 - http-outgoing-0 >> Accept-Encoding: gzip,deflate
2022-10-25 21:35:59 DEBUG apache.pinot.shaded.org.apache.http.impl.conn.LoggingManagedHttpClientConnection:122 - http-outgoing-0 << HTTP/1.1 200 OK
2022-10-25 21:35:59 DEBUG apache.pinot.shaded.org.apache.http.impl.conn.LoggingManagedHttpClientConnection:125 - http-outgoing-0 << Pinot-Controller-Host: <http://foo8d-rg1-36a.foo.company.com|foo8d-rg1-36a.foo.company.com>
2022-10-25 21:35:59 DEBUG apache.pinot.shaded.org.apache.http.impl.conn.LoggingManagedHttpClientConnection:125 - http-outgoing-0 << Pinot-Controller-Version: Unknown
2022-10-25 21:35:59 DEBUG apache.pinot.shaded.org.apache.http.impl.conn.LoggingManagedHttpClientConnection:125 - http-outgoing-0 << Access-Control-Allow-Origin: *
2022-10-25 21:35:59 DEBUG apache.pinot.shaded.org.apache.http.impl.conn.LoggingManagedHttpClientConnection:125 - http-outgoing-0 << Content-Type: application/json
2022-10-25 21:35:59 DEBUG apache.pinot.shaded.org.apache.http.impl.conn.LoggingManagedHttpClientConnection:125 - http-outgoing-0 << Content-Length: 125
2022-10-25 21:35:59 DEBUG apache.pinot.shaded.org.apache.http.impl.execchain.MainClientExec:285 - Connection can be kept alive indefinitely
2022-10-25 21:35:59 INFO apache.pinot.common.utils.http.HttpClient:280 - Sending request: <http://foo8d-rg1-36a.foo.company.com:9000/v2/segments?tableName=events> to controller: <http://foo8d-rg1-36a.foo.company.com|foo8d-rg1-36a.foo.company.com>, version: Unknown
2022-10-25 21:35:59 DEBUG apache.pinot.shaded.org.apache.http.impl.conn.PoolingHttpClientConnectionManager:344 - Connection [id: 0][route: {}-><http://foo8d-rg1-36a.foo.company.com:9000>] can be kept alive indefinitely
2022-10-25 21:35:59 DEBUG apache.pinot.shaded.org.apache.http.impl.conn.LoggingManagedHttpClientConnection:88 - http-outgoing-0: set socket timeout to 0
2022-10-25 21:35:59 DEBUG apache.pinot.shaded.org.apache.http.impl.conn.PoolingHttpClientConnectionManager:351 - Connection released: [id: 0][route: {}-><http://foo8d-rg1-36a.foo.company.com:9000>][total available: 1; route allocated: 1 of 2; total allocated: 1 of 20]
2022-10-25 21:35:59 INFO apache.pinot.segment.local.utils.SegmentPushUtils:277 - Response for pushing table events segment events_OFFLINE_1606134538000000000_1665361807000000000_1 to location <http://foo8d-rg1-36a.foo.company.com:9000> - 200: {"status":"Successfully uploaded segment: events_OFFLINE_1606134538000000000_1665361807000000000_1 of table: events_OFFLINE"}
2022-10-25 21:36:35 DEBUG apache.http.impl.conn.PoolingHttpClientConnectionManager:441 - Closing connections idle longer than 60000 MILLISECONDS
2022-10-25 21:36:35 DEBUG apache.http.impl.conn.PoolingHttpClientConnectionManager:441 - Closing connections idle longer than 60000 MILLISECONDS
2022-10-25 21:37:35 DEBUG apache.http.impl.conn.PoolingHttpClientConnectionManager:441 - Closing connections idle longer than 60000 MILLISECONDS
2022-10-25 21:37:35 DEBUG apache.http.impl.conn.LoggingManagedHttpClientConnection:79 - http-outgoing-4: Close connection
2022-10-25 21:37:35 DEBUG apache.http.impl.conn.PoolingHttpClientConnectionManager:441 - Closing connections idle longer than 60000 MILLISECONDS
2022-10-25 21:37:35 DEBUG apache.http.impl.conn.LoggingManagedHttpClientConnection:79 - http-outgoing-5: Close connection
2022-10-25 21:38:35 DEBUG apache.http.impl.conn.PoolingHttpClientConnectionManager:441 - Closing connections idle longer than 60000 MILLISECONDS
2022-10-25 21:38:35 DEBUG apache.http.impl.conn.PoolingHttpClientConnectionManager:441 - Closing connections idle longer than 60000 MILLISECONDS
2022-10-25 21:39:35 DEBUG apache.http.impl.conn.PoolingHttpClientConnectionManager:441 - Closing connections idle longer than 60000 MILLISECONDS
2022-10-25 21:39:35 DEBUG apache.http.impl.conn.PoolingHttpClientConnectionManager:441 - Closing connections idle longer than 60000 MILLISECONDS
2022-10-25 21:40:35 DEBUG apache.http.impl.conn.PoolingHttpClientConnectionManager:441 - Closing connections idle longer than 60000 MILLISECONDS
2022-10-25 21:40:35 DEBUG apache.http.impl.conn.PoolingHttpClientConnectionManager:441 - Closing connections idle longer than 60000 MILLISECONDS
2022-10-25 21:41:35 DEBUG apache.http.impl.conn.PoolingHttpClientConnectionManager:441 - Closing connections idle longer than 60000 MILLISECONDS
2022-10-25 21:41:35 DEBUG apache.http.impl.conn.PoolingHttpClientConnectionManager:441 - Closing connections idle longer than 60000 MILLISECONDS
2022-10-25 21:42:35 DEBUG apache.http.impl.conn.PoolingHttpClientConnectionManager:441 - Closing connections idle longer than 60000 MILLISECONDS
2022-10-25 21:42:35 DEBUG apache.http.impl.conn.PoolingHttpClientConnectionManager:441 - Closing connections idle longer than 60000 MILLISECONDS
2022-10-25 21:43:35 DEBUG apache.http.impl.conn.PoolingHttpClientConnectionManager:441 - Closing connections idle longer than 60000 MILLISECONDS
etc.
Haitao Zhang
10/25/2022, 10:13 PMGerrit van Doorn
10/25/2022, 10:15 PM/tmp/pinot-<some-sha>
and is removedGerrit van Doorn
10/25/2022, 10:52 PM$ netstat -natp | grep 31452
(Not all processes could be identified, non-owned process info
will not be shown, you would have to be root to see it all.)
tcp6 0 0 10.60.96.26:52788 10.112.145.226:5100 ESTABLISHED 31452/java
tcp6 0 0 10.60.96.26:38794 10.115.108.35:5100 ESTABLISHED 31452/java
tcp6 0 0 10.60.96.26:19296 10.113.174.10:5100 ESTABLISHED 31452/java
tcp6 0 0 10.60.96.26:18836 10.112.7.194:5100 ESTABLISHED 31452/java
tcp6 1 0 10.60.96.26:25764 10.60.96.26:9000 CLOSE_WAIT 31452/java
tcp6 0 0 127.0.0.1:41196 127.0.0.1:4987 ESTABLISHED 31452/java
tcp6 0 0 10.60.96.26:36692 10.113.37.2:5100 ESTABLISHED 31452/java
tcp6 0 0 10.60.96.26:24010 10.113.119.34:5100 ESTABLISHED 31452/java
There is a CLOSE_WAIT socketGerrit van Doorn
10/25/2022, 10:58 PMss -tap state CLOSE-WAIT
I still see 9 ports in the CLOSE-WAIT
stateGerrit van Doorn
10/25/2022, 11:01 PMHaitao Zhang
10/25/2022, 11:37 PMHaitao Zhang
10/25/2022, 11:39 PMThe temp data is written tojust minor thing (may not be related), the folder isand is removed/tmp/pinot-<some-sha>
/<prefix>/segmentMetadata-<some sha>
Gerrit van Doorn
10/28/2022, 7:10 PMNettyChannelBuilder b = ...
b.build()
once that happens (and not even using it) the program will not exit.
Maybe this is due to different version of Netty being used in the same application (Pinot’s version and our version, which is older)?Haitao Zhang
10/28/2022, 8:18 PM