Nizar Hejazi
03/19/2022, 11:01 PM
The following ingestion transform throws `java.lang.NullPointerException` when the value is null:
{
"columnName": "updatedAt_timestamp",
"transformFunction": "FromDateTime(updatedAt, 'yyyy-MM-dd''T''HH:mm:ss.SSS''Z''')"
}
I tried a Groovy script as follows, but I see this exception in the logs: `MissingPropertyException: No such property: DateTimeFormat for class: Script1`
{
"columnName": "col_timestamp",
"transformFunction": "Groovy({col == null ? null : DateTimeFormat.forPattern('yyyy-MM-dd\\'T\\'HH:mm:ss.SSS\\'Z\\'').withZone(DateTimeZone.forID(DateTimeZone.UTC.getID())).parseMillis(adminAccessGrantedOn)}, col)"
},
Do I need to import the Joda-Time classes into Groovy? Can I write a multi-line Groovy script as an ingestion transform?
Any other workaround to deal with nulls in the built-in `FromDateTime` function? (I can submit a PR to update the date-time functions to handle nulls.) Please note that I have `nullHandlingEnabled` set to true.
@Grab( 'joda-time:joda-time:2.10.5' )
import org.joda.time.DateTimeZone
import org.joda.time.format.DateTimeFormat
import org.joda.time.format.DateTimeFormatter
dt = '2022-03-19T11:00:18.789Z'
print DateTimeFormat.forPattern('yyyy-MM-dd\'T\'HH:mm:ss.SSS\'Z\'').withZone(DateTimeZone.forID(DateTimeZone.UTC.getID())).parseMillis(dt)
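As one possible shape for the null handling asked about above, here is a minimal Java sketch of a null-guarded parse. It uses `java.time` instead of Joda-Time to stay self-contained, and `NullSafeDateTimeSketch` and `fromDateTimeOrNull` are hypothetical names for illustration, not Pinot APIs.

```java
import java.time.LocalDateTime;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;

// Hypothetical helper for illustration only; not part of Pinot.
final class NullSafeDateTimeSketch {
    // Same pattern as in the transform above; 'T' and 'Z' are literals.
    private static final DateTimeFormatter FORMATTER =
        DateTimeFormatter.ofPattern("yyyy-MM-dd'T'HH:mm:ss.SSS'Z'");

    // Returns epoch millis (interpreting the text as UTC), or null for null input.
    static Long fromDateTimeOrNull(String value) {
        if (value == null) {
            return null; // propagate null instead of throwing an NPE
        }
        return LocalDateTime.parse(value, FORMATTER)
            .toInstant(ZoneOffset.UTC)
            .toEpochMilli();
    }

    public static void main(String[] args) {
        System.out.println(fromDateTimeOrNull("2022-03-19T11:00:18.789Z"));
        System.out.println(fromDateTimeOrNull(null));
    }
}
```

The guard sits in front of the parse call, so the formatter never sees a null value.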
Mayank
Nizar Hejazi
03/19/2022, 11:05 PM
Kartik Khare
03/19/2022, 11:07 PM
Nizar Hejazi
03/19/2022, 11:08 PM
With `nullHandlingEnabled` set to true, why can't I return null in an ingestion transform?
Kartik Khare
03/19/2022, 11:10 PM
When `nullHandlingEnabled` is set to true, whatever data you have uploaded, if the column is empty we mark it as null in our index; the value is still stored as the default one.
Nizar Hejazi
03/19/2022, 11:11 PM
Kartik Khare
03/19/2022, 11:19 PM
Nizar Hejazi
03/19/2022, 11:20 PM
Can we handle null in the `FromDateTime` and `ToDateTime` functions and return null in this case?
Source code: link
Kartik Khare
03/19/2022, 11:25 PM
Nizar Hejazi
03/19/2022, 11:31 PM
If `nullHandlingEnabled` is set to true, return null. Otherwise, throw an exception.
Kartik Khare
03/19/2022, 11:32 PM
Nizar Hejazi
03/19/2022, 11:49 PM
Based on the `nullHandlingEnabled` setting:
• If `nullHandlingEnabled` is set to true, mark the value as null in the null index and avoid calling the ingestion transform function.
• If `nullHandlingEnabled` is set to false, call the ingestion transform function.
Jackie
03/20/2022, 4:54 AM
Nizar Hejazi
03/20/2022, 9:42 PM
`nullHandlingEnabled` to false?
Mayank
Jackie
03/21/2022, 5:15 PM
`nullHandlingEnabled` should be independent of the ingestion transform, as it is applied after the transform is done. We should preserve the `null` values during the ingestion transform.
Nizar Hejazi
03/22/2022, 1:00 AM
Mayank
Jackie
03/23/2022, 10:10 PM
`nullableParameters` is meant to annotate only the functions that can take `null` and return something meaningful (e.g. `isNull`).
`fromDateTime` and `dateTimeConvert` cannot handle `null` input properly, so we should not annotate them and should just let the `InbuildFunctionEvaluator` evaluate the function to `null`.
`null` values will be filled with default values, which is the expected behavior.
Nizar Hejazi
03/23/2022, 10:36 PM
`fromDateTime` and `dateTimeConvert` with `nullableParameters = true`.
Note: I previously annotated other date-time functions (e.g. `ago`, `timeZoneHour`, etc.) with this annotation, since they could theoretically be used as an ingestion transform on a column that has null values. I agree this is highly unlikely, so I will remove the annotation; the probability of these functions' input coming from a data column is almost zero.
Jackie
03/23/2022, 10:45 PM
If we annotate `fromDateTime` with `nullableParameters = true`, `null` will be passed into the function and cause an NPE, since the scalar function does not handle `null` properly.
It should return `null` when getting a `null` input value, so we should not annotate it with `nullableParameters = true`.
The `null` check should happen when `nullableParameters` is `false` instead of `true`.
When a function is annotated with `nullableParameters = true`, that means the function can take `null` values, and we should pass the arguments as is; when it is not annotated, that means the function should not take `null` values, and we should directly return `null` when there is a `null` argument.
Nizar Hejazi
03/23/2022, 10:58 PM
`nullableParameters = true` by default (functions can handle nulls by default), overridden for `fromDateTime` and `dateTimeConvert` (set to false), since both methods cannot handle nulls.
Jackie
03/23/2022, 11:00 PM
The default should be `false` for `nullableParameters`. None of the existing scalar functions can handle `null`.
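The annotation semantics discussed above can be sketched in Java roughly as follows. This is an illustration only: `ScalarFn`, `NullableParametersSketch`, `upper`, and `evaluate` are stand-ins invented for the sketch, not Pinot's actual annotation or evaluator.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Method;

// Hypothetical sketch; names are assumptions, not Pinot classes.
final class NullableParametersSketch {
    // Stand-in annotation: defaults to false, meaning "cannot take null".
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.METHOD)
    @interface ScalarFn {
        boolean nullableParameters() default false;
    }

    // Not marked nullable: would throw an NPE if it ever received null.
    @ScalarFn
    public static String upper(String s) {
        return s.toUpperCase();
    }

    // Marked nullable: it returns something meaningful for null input.
    @ScalarFn(nullableParameters = true)
    public static Boolean isNull(Object o) {
        return o == null;
    }

    // Looks up a function by name and applies the null rule before invoking it.
    static Object evaluate(String name, Object... args) {
        for (Method m : NullableParametersSketch.class.getDeclaredMethods()) {
            ScalarFn fn = m.getAnnotation(ScalarFn.class);
            if (fn == null || !m.getName().equals(name)) {
                continue;
            }
            if (!fn.nullableParameters()) {
                // Function cannot take null: short-circuit to null without calling it.
                for (Object arg : args) {
                    if (arg == null) {
                        return null;
                    }
                }
            }
            try {
                return m.invoke(null, args); // arguments are passed as is
            } catch (ReflectiveOperationException e) {
                throw new IllegalStateException("Failed to invoke " + name, e);
            }
        }
        throw new IllegalArgumentException("Unknown function: " + name);
    }

    public static void main(String[] args) {
        System.out.println(evaluate("upper", "abc"));
        System.out.println(evaluate("upper", (Object) null));
        System.out.println(evaluate("isNull", (Object) null));
    }
}
```

With `false` as the default, an existing function that was never written with null in mind is skipped on null input, and only functions explicitly marked nullable ever receive a null argument.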
Nizar Hejazi
03/23/2022, 11:02 PM
`nullableParameters` set to false as you suggested, and only overridden for the JsonPathXXX functions.
Jackie
03/23/2022, 11:05 PM
`toJsonMapStr`, `jsonFormat`, `jsonPath*` with default value argument
Nizar Hejazi
03/23/2022, 11:17 PM
`jsonPathArray` cannot handle null. This is why we have `jsonPathArrayDefaultEmpty`.
Jackie
03/24/2022, 12:46 AM
`non-null` value when input is `null`
Nizar Hejazi
03/24/2022, 8:37 PM
Jackie
03/24/2022, 8:41 PM
Nizar Hejazi
03/24/2022, 8:46 PM