# troubleshooting
n
Good to know @Ranveer Singh. Another problem is the datatype for your DateTimeFieldSpecs. It should be STRING. I was able to generate sample data - the DateTimeFieldSpecs show up correctly
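for reference, a dateTimeFieldSpecs entry along these lines should work (just a sketch; the format string is my guess based on your timestamp pattern, so double-check it against the docs):
Copy code
"dateTimeFieldSpecs": [
  {
    "name": "StatusRecModifyTS",
    "dataType": "STRING",
    "format": "1:SECONDS:SIMPLE_DATE_FORMAT:EEE MMM dd HH:mm:ss z yyyy",
    "granularity": "1:SECONDS"
  }
]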
r
Neha, I see one problem. I don't see any dateTime column in the metadata. Because of this I am not able to plot time on graphs.
In segmentsConfig, if I mention these columns, I get a ClassCastException
n
which metadata?
so you don't see the date_time columns when you do select * ?
r
yes, I can see it as a column
Screenshot 2020-05-06 at 9.48.09 PM.png
in the left section of the Pinot Data Explorer
When I configure the table in Superset, I don't see the columns there
Screenshot 2020-05-06 at 9.50.20 PM.png
n
you see it in the Pinot Data Explorer left section? but not in Superset?
strange, let me check
r
no, I'm not able to see it in the Pinot Data Explorer either
that's what I was referring to as table metadata
but it does come back as a query result in the Pinot Data Explorer
n
I think the Pinot Data Explorer hides time columns. And as long as it is queryable, I guess it doesn't matter. But regarding Superset, maybe the connector hasn't been written to understand DateTime columns.
either way, can Superset understand time columns of the format you have?
or does it need epoch millis?
r
let me check that as well
If I look at the Pinot code, timeColumnName is not supported for dateTimeFieldSpecs, as that gives a ClassCastException. This is the segmentsConfig I tried:
Copy code
"segmentsConfig": {
  "timeColumnName": "StatusRecModifyTS",
  "schemaName": "rfp",
  "replication": "1"
}
n
yes, dateTimeFieldSpec will not work as time column. We are working to fix that.
but Superset shouldn't care if it is a time column or not in Pinot
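until then, the time column has to come from a timeFieldSpec. A rough sketch of what that looks like in the schema, using a LONG millis column (key names from memory, so double-check against the docs):
Copy code
"timeFieldSpec": {
  "incomingGranularitySpec": {
    "name": "StatusRecModifyMillis",
    "dataType": "LONG",
    "timeType": "MILLISECONDS"
  }
}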
r
yes, that's the problem..
with some loaded examples, I can see date and dateTime supported in Superset
You had sample data yesterday which you validated; can you check whether you are able to see those columns?
n
sure
r
Screenshot 2020-05-06 at 9.58.07 PM.png
n
what I mean is: Pinot has 3 types of columns - Dimension, Metric, Time. Recently, we added DateTime. I think the Superset-Pinot connector has been written for only Dimension, Metric and Time. Maybe it is not reading DateTime.
r
got it..
This is what I was guessing..
Let me check out the Superset code.
n
can you try one thing - instead of putting the time columns in dateTimeFieldSpecs, put them as STRING in dimensions
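i.e. something like this in the schema (sketch, using your StatusRecModifyTS column):
Copy code
{
  "name": "StatusRecModifyTS",
  "dataType": "STRING"
}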
r
Initially I did that, but I need time-series charts based on those metrics
n
they can be dimensions in Pinot; you can mark them as temporal in Superset, right?
Superset shouldn't care how the column is configured in Pinot
r
let me try what you are suggesting..
n
I confirmed that the Pinot-Superset connector ignores dateTimeFieldSpec 🙂 I will see if I can fix that, but until then, putting it as a dimension is the only option
r
Thanks Neha.. thanks for the clarity..
n
is your data APPEND (append some data hourly/daily) or REFRESH (upload the whole snapshot data at once)?
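(for context, that maps to segmentPushType in segmentsConfig; a sketch reusing your rfp config, with an illustrative push frequency:)
Copy code
"segmentsConfig": {
  "schemaName": "rfp",
  "replication": "1",
  "segmentPushType": "APPEND",
  "segmentPushFrequency": "daily"
}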
r
we want to do it in append mode..
batch is just me trying to understand.. in reality we have data ingestion directly from Kafka
Again blocked on Temporal
using the following date-time value and format:
Sun Apr 26 01:38:01 UTC 2020 with pattern %a %b %d %H:%M:%S %Z %Y
but it's not working
n
you changed it to a dimension? can you at least see the column in Superset this time?
something wrong with this pattern, then?
%a %b %d %H:%M:%S %Z %Y
r
yes, that is showing up
for a given datetime, it looks correct..
but it's not working in Superset
n
when you say not working, what is exactly happening ?
I tried it out with my sample data. It's not going to work because the format is not supported by Superset. See the hint below the "DateTime format" text box:
Copy code
The pattern of timestamp format. For strings use python datetime string pattern expression which needs to adhere to the ISO 8601 standard to ensure that the lexicographical ordering coincides with the chronological ordering. If the timestamp format does not adhere to the ISO 8601 standard you will need to define an expression and type for transforming the string into a date or timestamp. Note currently time zones are not supported. If time is stored in epoch format, put epoch_s or epoch_ms.
"needs to adhere to ISO 8601 standard". The format you have is not ISO standard: https://en.wikipedia.org/wiki/ISO_8601
Here's a suggestion: convert the values to millis while creating the segment, using Groovy transform functions. Here's the schema I used to do this:
Copy code
{
  "schemaName": "rfp",
  "dimensionFieldSpecs": [
    {
      "name": "status",
      "dataType": "STRING"
    },
    {
      "name": "fulfilmentType",
      "dataType": "STRING"
    },
    {
      "name": "soOrderHeaderKey",
      "dataType": "STRING"
    },
    {
      "name": "SONumber",
      "dataType": "STRING"
    },
    {
      "name": "CommsResponse",
      "dataType": "INT"
    },
    {
      "name": "extnOriginalNo",
      "dataType": "INT"
    },
    {
      "name": "messageId",
      "dataType": "STRING"
    },
    {
      "name": "orderLineKey",
      "dataType": "STRING"
    },
    {
      "name": "fulfilmentSubType",
      "dataType": "STRING"
    },
    {
      "name": "storeId",
      "dataType": "STRING"
    },
    {
      "name": "soOrderLineKey",
      "dataType": "STRING"
    },
    {
      "name": "primeLineNumber",
      "dataType": "STRING"
    },
    {
      "name": "PONumber",
      "dataType": "STRING"
    },
    {
      "name": "itemId",
      "dataType": "STRING"
    },
    {
      "name": "orderHeaderKey",
      "dataType": "STRING"
    },
    {
      "name": "releaseStatusKey",
      "dataType": "STRING"
    },
    {
      "name": "RFP",
      "dataType": "STRING"
    },
    {
      "name": "EmailAck",
      "dataType": "STRING"
    },
    {
      "name": "StatusRecModifyMillis",
      "dataType": "LONG",
      "transformFunction": "Groovy({new Date().parse('EEE MMM dd HH:mm:ss z yyyy', StatusRecModifyTS).getTime()}, StatusRecModifyTS)"
    },
    {
      "name": "StatusRecCreateMillis",
      "dataType": "LONG",
      "transformFunction": "Groovy({new Date().parse('EEE MMM dd HH:mm:ss z yyyy', StatusRecCreateTS).getTime()}, StatusRecCreateTS)"
    },
    {
      "name": "EmailSendCreateMillis",
      "dataType": "LONG",
      "transformFunction": "Groovy({new Date().parse('EEE MMM dd HH:mm:ss z yyyy', EmailSendCreate).getTime()}, EmailSendCreate)"
    }
  ],
  "metricFieldSpecs": [
    {
      "name": "TimeTaken",
      "dataType": "LONG"
    }
  ]
}
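As a quick sanity check of that pattern outside Pinot (plain Java; Groovy's Date.parse delegates to SimpleDateFormat):
Copy code
import java.text.SimpleDateFormat;
import java.util.Locale;

public class PatternCheck {
  public static void main(String[] args) throws Exception {
    // same pattern as the Groovy transforms above
    SimpleDateFormat fmt = new SimpleDateFormat("EEE MMM dd HH:mm:ss z yyyy", Locale.ENGLISH);
    long millis = fmt.parse("Sun Apr 26 01:38:01 UTC 2020").getTime();
    System.out.println(millis); // 1587865081000
  }
}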
LMK if this works out!
r
Looks like an issue with the CSV reader. It is uploading -9223372036854776000, which is the default LONG value for these columns. I tried running ExpressionTransformerTest with this added as a use case, and it works fine:
Copy code
@Test
public void testDateTransformFromString() {
  Schema pinotSchema = new Schema();
  // Destination column StatusRecCreateTSInMillis is filled by a Groovy transform
  // that parses the source column StatusRecModifyTS into epoch millis.
  DimensionFieldSpec dimensionFieldSpec =
      new DimensionFieldSpec("StatusRecCreateTSInMillis", FieldSpec.DataType.LONG, true);
  dimensionFieldSpec.setTransformFunction(
      "Groovy({new Date().parse('EEE MMM dd HH:mm:ss z yyyy', StatusRecModifyTS).getTime()}, StatusRecModifyTS)");
  pinotSchema.addField(dimensionFieldSpec);

  ExpressionTransformer expressionTransformer = new ExpressionTransformer(pinotSchema);

  GenericRow genericRow = new GenericRow();
  genericRow.putValue("StatusRecModifyTS", "Sun Apr 26 01:38:01 UTC 2020");

  // apply the transform and verify the parsed millis
  expressionTransformer.transform(genericRow);
  Assert.assertEquals(genericRow.getValue("StatusRecCreateTSInMillis"), 1587865081000L);
}
looks like the CSVReader is ignoring the expression.. that's my guess
n
Are you using latest code, or the release?
This is not available in the release
It worked for me. I was using latest code
r
I am using the latest code.. and deployed it locally
but I built last week..
let me take the latest, rebuild, and try
n
hmm, last week's code should have had it.. anyway, the latest will work. Even I tried using Docker, which has one-week-old code, and it didn't work for me. When I moved to the latest code locally, it worked
r
let me check..
is there any specific branch where the latest code is?
n
no, just the latest master
r
I see, I have taken master
ok, thanks
n
did it work?
r
Yes, it worked.. Thank you, and sorry for the late response