Nizar Hejazi
03/19/2022, 11:01 PM java.lang.NullPointerException when the value is null:
{
"columnName": "updatedAt_timestamp",
"transformFunction": "FromDateTime(updatedAt, 'yyyy-MM-dd''T''HH:mm:ss.SSS''Z''')"
}
I am trying to use a Groovy script as follows, but I see this exception in the logs: MissingPropertyException: No such property: DateTimeFormat for class: Script1
{
"columnName": "col_timestamp",
"transformFunction": "Groovy({col == null ? null : DateTimeFormat.forPattern('yyyy-MM-dd\\'T\\'HH:mm:ss.SSS\\'Z\\'').withZone(DateTimeZone.forID(DateTimeZone.UTC.getID())).parseMillis(col)}, col)"
},
Do I need to import the Joda-Time classes into Groovy? Can I write a multi-line Groovy script as an ingestion transform?
Is there any other workaround to deal w/ nulls in the FromDateTime inbuilt function? (I can submit a PR to update the date time functions to handle nulls.) Please note that I have "nullHandlingEnabled" set to true.
Nizar Hejazi
03/19/2022, 11:01 PM
@Grab('joda-time:joda-time:2.10.5')
import org.joda.time.DateTimeZone
import org.joda.time.format.DateTimeFormat
import org.joda.time.format.DateTimeFormatter
dt = '2022-03-19T11:00:18.789Z'
print DateTimeFormat.forPattern('yyyy-MM-dd\'T\'HH:mm:ss.SSS\'Z\'').withZone(DateTimeZone.forID(DateTimeZone.UTC.getID())).parseMillis(dt)
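For comparison, here is a minimal Java sketch of the same parse with an explicit null guard, which is the behavior the Groovy transform above is aiming for. It is only an illustration: it uses the JDK's `java.time` instead of Joda-Time so that it runs without extra dependencies, and `NullSafeDateTime` and `toEpochMillisOrNull` are hypothetical names, not Pinot APIs.

```java
import java.time.LocalDateTime;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;

// Hypothetical helper for illustration only; not part of Pinot.
final class NullSafeDateTime {
    // Same pattern as in the transform config. The trailing 'Z' is a quoted
    // literal, so the UTC zone has to be applied explicitly after parsing.
    private static final DateTimeFormatter FORMAT =
        DateTimeFormatter.ofPattern("yyyy-MM-dd'T'HH:mm:ss.SSS'Z'");

    // Returns epoch milliseconds, or null when the input is null,
    // instead of letting the parser throw a NullPointerException.
    static Long toEpochMillisOrNull(String value) {
        if (value == null) {
            return null;
        }
        return LocalDateTime.parse(value, FORMAT)
            .toInstant(ZoneOffset.UTC)
            .toEpochMilli();
    }
}
```

Calling `NullSafeDateTime.toEpochMillisOrNull(null)` returns `null` without touching the parser, while a well-formed string such as `2022-03-19T11:00:18.789Z` is converted to epoch milliseconds.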
Mayank
Nizar Hejazi
03/19/2022, 11:05 PM
Nizar Hejazi
03/19/2022, 11:05 PM
Mayank
Kartik Khare
03/19/2022, 11:07 PM
Nizar Hejazi
03/19/2022, 11:08 PM
Nizar Hejazi
03/19/2022, 11:09 PM "nullHandlingEnabled" set to True, why can't I return null in an ingestion transform?
Kartik Khare
03/19/2022, 11:10 PM nullHandlingEnabled is set to true, whatever data you have uploaded, if the column is empty, we mark it as null in our index, but the value is still stored as a default one.
Nizar Hejazi
03/19/2022, 11:11 PM
Nizar Hejazi
03/19/2022, 11:12 PM
Nizar Hejazi
03/19/2022, 11:18 PM
Kartik Khare
03/19/2022, 11:19 PM
Nizar Hejazi
03/19/2022, 11:20 PM
Nizar Hejazi
03/19/2022, 11:22 PM FromDateTime and ToDateTime functions and return null in this case?
Source code: link
Kartik Khare
03/19/2022, 11:25 PM
Nizar Hejazi
03/19/2022, 11:31 PM nullHandlingEnabled is set to true, returns null. Otherwise, throw an exception.
Kartik Khare
03/19/2022, 11:32 PM
Nizar Hejazi
03/19/2022, 11:49 PM nullHandlingEnabled setting:
• If nullHandlingEnabled is set to true, mark the value as null in the null index, and avoid calling the transform ingestion function.
• If nullHandlingEnabled is set to false, call the transform ingestion function.
Jackie
03/20/2022, 4:54 AM
Jackie
03/20/2022, 4:55 AM
Nizar Hejazi
03/20/2022, 9:42 PM
Nizar Hejazi
03/20/2022, 10:08 PM nullHandlingEnabled to false?
Nizar Hejazi
03/20/2022, 10:09 PM
Mayank
Jackie
03/21/2022, 5:15 PM nullHandlingEnabled should be independent of the ingestion transform as it happens after the transform is done. We should preserve the null values during the ingestion transform
Nizar Hejazi
03/22/2022, 1:00 AM
Mayank
Jackie
03/23/2022, 10:10 PM
Jackie
03/23/2022, 10:11 PM nullableParameters is to annotate only the functions that can take null and return something meaningful (e.g. isNull)
Jackie
03/23/2022, 10:12 PM fromDateTime and dateTimeConvert, they cannot handle null input properly, so we should not annotate them and just let the InbuiltFunctionEvaluator evaluate the function to null.
Jackie
03/23/2022, 10:13 PM null values will be filled with default values, which is the expected behavior
Nizar Hejazi
03/23/2022, 10:36 PM fromDateTime and dateTimeConvert w/ nullableParameters = true.
Note: I previously annotated other datetime functions (e.g. ago, timeZoneHour, etc.) w/ this annotation since they could theoretically be used as an ingestion transform w/ a column that has null values, but I agree this is highly unlikely and hence I will remove the annotation. The probability of these functions’ input coming from a data column is almost zero.
Jackie
03/23/2022, 10:45 PM fromDateTime with nullableParameters = true, null will be passed into the function, and cause NPE since the scalar function does not handle null properly
Jackie
03/23/2022, 10:45 PM null when getting null input value, so we should not annotate it with nullableParameters = true
Jackie
03/23/2022, 10:52 PM null check when nullableParameters is false instead of true.
Jackie
03/23/2022, 10:53 PM nullableParameters = true, that means the function can take null values, and we should pass the arguments as is; when it is not annotated, that means the function should not take null values, and we should directly return null when there is a null argument.
Jackie
03/23/2022, 10:53 PM
Nizar Hejazi
03/23/2022, 10:58 PM nullableParameters = true by default (functions can handle nulls by default) and override it for fromDateTime and dateTimeConvert (set to false) since both methods cannot handle nulls.
Jackie
03/23/2022, 11:00 PM nullableParameters. All existing scalar functions cannot handle null
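The semantics Jackie describes in this thread can be sketched in a few lines of Java. This is an illustration under stated assumptions, not Pinot's actual evaluator: `NullAwareEvaluator`, `ScalarFn` and the sample functions are hypothetical names, and a plain boolean field stands in for the `nullableParameters` annotation attribute.

```java
import java.util.function.Function;

// Hypothetical sketch of the dispatch rule discussed above; not Pinot code.
final class NullAwareEvaluator {

    // Stand-in for a registered scalar function plus its annotation flag.
    static final class ScalarFn {
        final Function<Object[], Object> body;
        final boolean nullableParameters;

        ScalarFn(Function<Object[], Object> body, boolean nullableParameters) {
            this.body = body;
            this.nullableParameters = nullableParameters;
        }
    }

    // If the function is not marked as null-tolerant, any null argument
    // short-circuits to null and the body is never called. Otherwise the
    // arguments are passed through as is.
    static Object evaluate(ScalarFn fn, Object[] args) {
        if (!fn.nullableParameters) {
            for (Object arg : args) {
                if (arg == null) {
                    return null;
                }
            }
        }
        return fn.body.apply(args);
    }

    // Can take null and return something meaningful, so it opts in.
    static final ScalarFn IS_NULL = new ScalarFn(a -> a[0] == null, true);

    // Stand-in for a function such as fromDateTime that cannot handle null:
    // left at the default (false), so null input evaluates to null.
    static final ScalarFn STRING_LENGTH =
        new ScalarFn(a -> ((String) a[0]).length(), false);

    // The same body wrongly marked as null-tolerant: null reaches the body
    // and it throws a NullPointerException.
    static final ScalarFn STRING_LENGTH_MISANNOTATED =
        new ScalarFn(a -> ((String) a[0]).length(), true);
}
```

With this rule, `evaluate(STRING_LENGTH, new Object[] {null})` yields `null`, `evaluate(IS_NULL, new Object[] {null})` yields `true`, and the mis-annotated variant throws, which mirrors the NPE concern raised earlier about annotating fromDateTime.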
Jackie
03/23/2022, 11:00 PM
Nizar Hejazi
03/23/2022, 11:02 PM
Nizar Hejazi
03/23/2022, 11:05 PM nullableParameters set to false as you suggested, and only override it for JsonPathXXX functions.
Jackie
03/23/2022, 11:05 PM
Jackie
03/23/2022, 11:06 PM
Jackie
03/23/2022, 11:09 PM toJsonMapStr, jsonFormat, jsonPath* with default value argument
Nizar Hejazi
03/23/2022, 11:17 PM
Nizar Hejazi
03/23/2022, 11:32 PM jsonPathArray cannot handle null. This is why we have jsonPathArrayDefaultEmpty.
Jackie
03/24/2022, 12:46 AM non-null value when input is null
Nizar Hejazi
03/24/2022, 8:37 PM
Jackie
03/24/2022, 8:41 PM
Jackie
03/24/2022, 8:43 PM
Nizar Hejazi
03/24/2022, 8:46 PM