Hello there, I am trying to validate this schema ...
# troubleshooting
s
Hello there, I am trying to validate this table config using the schema validate API. It keeps returning 404 with reason null. Can someone help me find the issue with this config?
{
  "tableName": "lineorder_star_OFFLINE",
  "tableType": "OFFLINE",
  "segmentsConfig": {
    "timeColumnName": "LO_ORDERDATE", //date field with day-granularity
    "timeType": "DAYS",
    "replication": "1",
    "schemaName": "lineorder"
  },
  "tenants": {
    "broker": "DefaultTenant",
    "server": "DefaultTenant"
  },
  "metadata": {
    "customConfigs": {}
  },
  "tableIndexConfig": {
    "starTreeIndexConfigs": [
      {
        "dimensionsSplitOrder": [
          "LO_ORDERDATE", //date
          "LO_SUPPKEY", // dim field
          "LO_PARTKEY", // dim field
          "LO_DISCOUNT", // measure
          "LO_QUANTITY", //measure
          "LO_REVENUE", // dim field
          "LO_ORDERPRIORITY" //measure
        ],
        "skipStarNodeCreationForDimensions": [],
        "functionColumnPairs": [
          "SUM__LO_QUANTITY",
          "COUNT__LO_ORDERKEY",
          "SUM__LO_REVENUE"
        ]
      }
    ]
  }
}
n
Did you mean you are trying to validate this table config? Or did you post the table config by mistake? Are you using Swagger? Can you share the entire request and the logs from the controller?
s
Yes, I am using Swagger to validate this config.
Curl:
curl -X POST "http://localhost:9000/tableConfigs/validate" -H "accept: application/json" -H "Content-Type: application/json" -d "{ \"tableName\": \"lineorder_star_OFFLINE\", \"tableType\": \"OFFLINE\", \"segmentsConfig\": { \"timeColumnName\": \"LO_ORDERDATE\", \"timeType\": \"DAYS\", \"replication\": \"1\", \"schemaName\": \"lineorder\" }, \"tenants\": { \"broker\": \"DefaultTenant\", \"server\": \"DefaultTenant\" }, \"metadata\": { \"customConfigs\": {} }, \"tableIndexConfig\": { \"starTreeIndexConfigs\": [ { \"dimensionsSplitOrder\": [ \"LO_ORDERDATE\", \"LO_SUPPKEY\", \"LO_PARTKEY\", \"LO_DISCOUNT\", \"LO_QUANTITY\", \"LO_REVENUE\", \"LO_ORDERPRIORITY\" ], \"skipStarNodeCreationForDimensions\": [], \"functionColumnPairs\": [ \"SUM__LO_QUANTITY\", \"COUNT__LO_ORDERKEY\", \"SUM__LO_REVENUE\" ] } ] }}"
Controller logs attached:
Oh, I was checking the wrong API. I tested it against
http://localhost:9000/tableConfigs/validate
Yet it just says invalid JSON. I am not sure which field is causing it.
m
} exception: Unexpected character ('/' (code 47)): maybe a (non-standard) comment? (not recognized as one since Feature 'ALLOW_COMMENTS' not enabled for parser)
 at [Source: (String)"{
  "tableName": "lineorder_star_OFFLINE",
  "tableType": "OFFLINE",
  "segmentsConfig": {
it doesn't like the comments
if you take those out it'll be fine
d
In general JSON decoders don't accept comments.
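As a rough illustration of the point above, here is a minimal Python sketch showing a strict JSON parser rejecting `//` comments, and one way to strip them before sending a config. The helper name `strip_line_comments` is hypothetical and not part of Pinot. It only handles `//` line comments, not `/* ... */` blocks.

```python
import json


def strip_line_comments(text: str) -> str:
    """Remove // comments that appear outside JSON string literals."""
    out = []
    in_string = False
    escaped = False
    i = 0
    while i < len(text):
        ch = text[i]
        if in_string:
            out.append(ch)
            if escaped:
                escaped = False
            elif ch == "\\":
                escaped = True
            elif ch == '"':
                in_string = False
            i += 1
        elif ch == '"':
            in_string = True
            out.append(ch)
            i += 1
        elif text.startswith("//", i):
            # Skip everything up to (but not including) the newline.
            while i < len(text) and text[i] != "\n":
                i += 1
        else:
            out.append(ch)
            i += 1
    return "".join(out)


# A small annotated fragment in the style of the config posted above.
annotated = """{
  "timeColumnName": "LO_ORDERDATE", // date field with day-granularity
  "controller": "http://localhost:9000"
}"""

try:
    json.loads(annotated)
    strict_parse_ok = True
except json.JSONDecodeError:
    strict_parse_ok = False  # the comment makes the document invalid JSON

clean = json.loads(strip_line_comments(annotated))
print(strict_parse_ok, clean)
```

The string-aware scan matters because a naive search for `//` would also cut URLs such as `http://localhost:9000` that sit inside string values.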
n
The tableConfigs API needs the schema plus the table config in the JSON. You need to try with just the tables/validate API.
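A minimal sketch of that suggestion in Python, assuming the controller runs at `http://localhost:9000` as in the curl command above. The function name `build_validate_request` is hypothetical, and the table config is truncated for brevity. The request is only constructed here; sending it requires a running controller.

```python
import json
import urllib.request


def build_validate_request(table_config: dict,
                           base_url: str = "http://localhost:9000") -> urllib.request.Request:
    """Build a POST to tables/validate carrying only the table config."""
    body = json.dumps(table_config).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/tables/validate",
        data=body,
        headers={"accept": "application/json", "Content-Type": "application/json"},
        method="POST",
    )


# Truncated table config; the full one is shown earlier in the thread.
table_config = {"tableName": "lineorder_star_OFFLINE", "tableType": "OFFLINE"}
request = build_validate_request(table_config)
print(request.get_method(), request.full_url)

# To actually send it (needs a reachable controller):
# with urllib.request.urlopen(request) as response:
#     print(response.read().decode("utf-8"))
```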
s
okay thank you
@User I haven't used them in the API call. They were just for explanation.
Thanks @User, the tables/validate API validated the config correctly. However, I am stuck at the next step in the process. Ingestion works for smaller data files (a few KB each), but now I have a 35 GB CSV that I am trying to ingest, and it keeps failing. During ingestion it creates the input/output folders and the star index file in the output folder, but somehow it fails to move the result into the segments folder. curl -X GET "http://localhost:9000/debug/tables/lineorder_star?verbosity=0" -H "accept: application/json" returns:
"errorMessage" : "Cannot retrieve ingestion status for Table : lineorder_star_OFFLINE since it does not use the built-in SegmentGenerationAndPushTask task"
There are no controller logs for this process. Is there a way to make the ingestion more verbose?
n
A single 35 GB file is likely the problem. If you use the launch data ingestion command, one input file becomes one segment. Can you break that file up into smaller parts?
What do the logs in the output of the launch ingestion command say?
s
No errors, it just exits… I will try with smaller files. Thank you 😊
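The suggestion above to break the file into smaller parts could be sketched in Python roughly as follows. This assumes the CSV has a single header row and is comma-delimited, neither of which is confirmed in the thread. The function name `split_csv`, the part-file naming, and the `rows_per_file` value are all illustrative choices. The file is streamed row by row, so it is never loaded into memory in full.

```python
import csv
from pathlib import Path
from typing import List


def split_csv(src: Path, out_dir: Path, rows_per_file: int) -> List[Path]:
    """Split src into parts of at most rows_per_file data rows each.

    Assumes the first row of src is a header, which is repeated in every part.
    """
    out_dir.mkdir(parents=True, exist_ok=True)
    parts: List[Path] = []
    out_file = None
    writer = None
    try:
        with src.open(newline="") as f:
            reader = csv.reader(f)
            header = next(reader)
            for n, row in enumerate(reader):
                if n % rows_per_file == 0:
                    # Start a new part file and repeat the header in it.
                    if out_file is not None:
                        out_file.close()
                    part = out_dir / f"{src.stem}_part{len(parts):05d}.csv"
                    out_file = part.open("w", newline="")
                    writer = csv.writer(out_file)
                    writer.writerow(header)
                    parts.append(part)
                writer.writerow(row)
    finally:
        if out_file is not None:
            out_file.close()
    return parts
```

For example, `split_csv(Path("lineorder.csv"), Path("input_parts"), 1_000_000)` would write one file per million rows into a hypothetical `input_parts` directory, which could then be used as the ingestion input directory so that each part becomes its own segment.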