Using FileHelpers.Dynamic to read a fixed-width file and upload it to SQL

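Okay, I'll try to explain this as best I can. I have written an application that uses a SQL table to define the structure of a fixed-width data source (header, start index, field length, and so on). When my application runs, it queries this table and creates a DataTable object (call it finalDT) containing DataColumn objects with ColumnName = header. I then add to this table a set of DataColumn objects that exist in every data source we use (I often call these derived columns). I also create a primary key field that is an auto-incrementing integer. Originally I rolled my own solution for reading the fixed-width file, but I'm trying to convert it to use FileHelpers. Mainly, I want to incorporate it so that I gain access to the other file types FileHelpers can parse (CSV, Excel, etc.).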
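To make the setup above concrete, here is a minimal sketch of how a table like finalDT could be assembled from the stored field definitions. The `FieldSpec` class, the `FinalTableBuilder` name, the key column name `pk`, and the choice of string for every non-key column are assumptions made for illustration; the post does not show the real definition table or the real column types.

```csharp
using System;
using System.Collections.Generic;
using System.Data;

// Hypothetical stand-in for one row of the SQL table that describes the file layout.
public sealed class FieldSpec
{
    public string Header { get; set; }
    public int StartIndex { get; set; }
    public int Length { get; set; }
}

public static class FinalTableBuilder
{
    // Builds an empty DataTable: the auto-increment key first, then one string
    // column per configured field, then the derived columns shared by all sources.
    public static DataTable Build(string tableName,
                                  IEnumerable<FieldSpec> specs,
                                  IEnumerable<string> derivedColumns)
    {
        var table = new DataTable(tableName);

        // Assumed key column name "pk"; the post only says it is an
        // auto-incrementing integer.
        var key = new DataColumn("pk", typeof(int))
        {
            AutoIncrement = true,
            AutoIncrementSeed = 1,
            AutoIncrementStep = 1
        };
        table.Columns.Add(key);
        table.PrimaryKey = new[] { key };

        foreach (FieldSpec spec in specs)
            table.Columns.Add(spec.Header, typeof(string));

        // Column types for the derived columns are an assumption (string here).
        foreach (string name in derivedColumns)
            table.Columns.Add(name, typeof(string));

        return table;
    }
}
```

Calling `FinalTableBuilder.Build("demo", specs, new[] { "jobid", "serial" })` would give a table whose first column is the key, followed by the file's fields and then the derived columns.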

Now, my problem. Using FileHelpers.Dynamic, I was able to create a FileHelperEngine object with the following method:

private static FileHelperEngine GetFixedWidthFileClass(bool ignore) 
{ 
    singletonArguments sArgs = singletonArguments.sArgs; 
    singletonSQL sSQL = singletonSQL.sSQL; 
    List<string> remove = new List<string>(); 

    FixedLengthClassBuilder flcb = new FixedLengthClassBuilder(sSQL.FixedDataDefinition.DataTableName); 
    flcb.IgnoreFirstLines = 1; 
    flcb.IgnoreLastLines = 1; 
    flcb.IgnoreEmptyLines = true; 

    foreach (var dcs in sSQL.FixedDataDefinition.Columns)
    {
        flcb.AddField(dcs.header, Convert.ToInt32(dcs.length), "String");

        if (ignore && dcs.ignore)
        {
            flcb.LastField.FieldValueDiscarded = true; //If we want to ignore a column, this is how to do it. Would like to incorporate this.
            flcb.LastField.Visibility = NetVisibility.Protected;
        }
        else
        {
            flcb.LastField.TrimMode = TrimMode.Both;
            flcb.LastField.FieldNullValue = string.Empty;
        }
    }

    return new FileHelperEngine(flcb.CreateRecordClass()); 
}

Here, sSQL.FixedDataDefinition.Columns is where I store the field definitions for my fixed-width data source file. I then generate a DataTable by doing:

DataTable dt = engine.ReadFileAsDT(file);

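where file is the full path to the fixed-width file and engine holds the result of the GetFixedWidthFileClass() method shown above. So now I have a DataTable with no primary key and none of the derived columns. In addition, every column in dt is marked ReadOnly = true. This is where things become a mess.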

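I need to get dt stuffed into finalDT, and finalDT needs to be okay with dt not having any primary key information. If that can happen, I can use finalDT to upload my data to my SQL table. If it can't, I need a way for finalDT to have no primary key but still upload to my SQL table. Will SqlBulkCopy allow this? Is there another way?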

At this point I'm willing to start from scratch, as long as I can use FileHelpers to parse the fixed-width file and the result ends up stored in my SQL table.
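On the SqlBulkCopy question: SqlBulkCopy does not look at a DataTable's PrimaryKey, and when explicit column mappings are supplied it copies only the mapped columns. A minimal sketch follows, assuming the destination column names match the DataTable column names and that the key is an identity column generated by SQL Server. The `BulkUploader` class, its method names, and the parameters are hypothetical, and the code assumes `System.Data.SqlClient` is referenced.

```csharp
using System;
using System.Collections.Generic;
using System.Data;
using System.Data.SqlClient;

public static class BulkUploader
{
    // Returns the names of the columns to send, i.e. everything except the key.
    public static List<string> ColumnsToCopy(DataTable table, string keyColumnName)
    {
        var names = new List<string>();
        foreach (DataColumn column in table.Columns)
        {
            if (!string.Equals(column.ColumnName, keyColumnName, StringComparison.OrdinalIgnoreCase))
                names.Add(column.ColumnName);
        }
        return names;
    }

    // Bulk-copies the table, leaving the key column to be generated by the server.
    public static void Upload(DataTable table, string connectionString,
                              string destinationTable, string keyColumnName)
    {
        using (var bulk = new SqlBulkCopy(connectionString))
        {
            bulk.DestinationTableName = destinationTable;

            // Assumption: source and destination column names are identical.
            foreach (string name in ColumnsToCopy(table, keyColumnName))
                bulk.ColumnMappings.Add(name, name);

            bulk.WriteToServer(table);
        }
    }
}
```

With this shape it should not matter whether the DataTable carries a primary key, because the key column is simply left out of the mappings.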

Source

2016-08-18 breusshe

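Figured it out. It's not pretty, but here is how it works. Basically, the way I set up my code in the original post still applies, since I changed nothing in the GetFixedWidthFileClass() method. I then had to add two methods to get finalDT set up correctly: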

/// <summary> 
///  For a given datasource file, add all rows to the DataSet and collect Hexdump data
/// </summary> 
/// <param name="ds"> 
///  The <see cref="System.Data.DataSet" /> to add to 
/// </param> 
/// <param name="file"> 
///  The datasource file to process 
/// </param> 
internal static void GenerateDatasource(ref DataSet ds, ref FileHelperEngine engine, DataSourceColumnSpecs mktgidSpecs, string file) 
{ 
    // Some singleton class instances to hold program data I will need. 
    singletonSQL sSQL = singletonSQL.sSQL; 
    singletonArguments sArgs = singletonArguments.sArgs; 

    try 
    { 
     // Load a DataTable with contents of datasource file. 
     DataTable dt = engine.ReadFileAsDT(file); 

     // Clean up the DataTable by removing columns that should be ignored. 
     DataTableCleanUp(ref dt, ref engine); 

     // ReadFileAsDT() makes all of the columns ReadOnly. Fix that. 
     foreach (DataColumn column in dt.Columns) 
      column.ReadOnly = false; 

     // Okay, now get a Primary Key and add in the derived columns. 
     GenerateDatasourceSchema(ref dt); 

     // Parse all of the rows and columns to do data clean up and assign some custom 
     // values. Add custom values for jobID and serial columns to each row in the DataTable. 
     for (int row = 0; row < dt.Rows.Count; row++) 
     { 
      string version = string.Empty; // The file version 
      bool found = false; // Used to get out of foreach loops once the required condition is found. 

      // Iterate all configured jobs and add the jobID and serial number to each row 
      // based upon match. 
      foreach (JobSetupDetails job in sSQL.VznJobDescriptions.JobDetails) 
      { 
       // Version must match id in order to update the row. Break out once we find 
       // the match to save time. 
       version = dt.Rows[row][dt.Columns[mktgidSpecs.header]].ToString().Trim().Split(new char[] { '_' })[0]; 
       foreach (string id in job.ids) 
       { 
        if (version.Equals(id)) 
        { 
         dt.Rows[row][dt.Columns["jobid"]] = job.jobID; 

         lock (locklist) 
          dt.Rows[row][dt.Columns["serial"]] = job.serial++; 

         found = true; 
         break; 
        } 
       } 
       if (found) 
        break; 
      } 

      // Parse all columns to do data clean up. 
      for (int column = 0; column < dt.Columns.Count; column++) 
      { 
       // This tab character keeps showing up in the data. It should not be there, 
       // but customer won't fix it, so we have to. 
       if (dt.Rows[row][column].GetType() == typeof(string)) 
        dt.Rows[row][column] = dt.Rows[row][column].ToString().Replace('\t', ' '); 
      } 
     } 

     dt.AcceptChanges(); 

     // DataTable is cleaned up and modified. Time to push it into the DataSet. 
     lock (locklist) 
     { 
      // If dt is writing back to the DataSet for the first time, Rows.Count will be 
      // zero. Since the DataTable in the DataSet does not have the table schema and 
      // since dt.Copy() is not an option (ds is referenced, so Copy() won't work), Use 
      // Merge() and use the option MissingSchemaAction.Add to create the schema. 
      if (ds.Tables[sSQL.FixedDataDefinition.DataTableName].Rows.Count == 0) 
       ds.Tables[sSQL.FixedDataDefinition.DataTableName].Merge(dt, false, MissingSchemaAction.Add); 
      else 
      { 
       // If this is not the first write to the DataSet, remove the PrimaryKey 
       // column to avoid duplicate key values. Use ImportRow() rather than .Merge()
       // since, for whatever reason, Merge() is overwriting ds each time it is 
       // called and ImportRow() is actually appending the row. Ugly, but can't 
       // figure out another way to make this work. 
       dt.PrimaryKey = null; 
       dt.Columns.Remove(dt.Columns[0]); 
       foreach (DataRow dr in dt.Rows) 
        ds.Tables[sSQL.FixedDataDefinition.DataTableName].ImportRow(dr); 
      } 

      // Accept all the changes made to the DataSet. 
      ds.Tables[sSQL.FixedDataDefinition.DataTableName].AcceptChanges(); 
     } 

     // Clean up memory. 
     dt.Clear(); 

     // Log my progress. 
     log.GenerateLog("0038", log.Info 
         , engine.TotalRecords.ToString() + " DataRows successfully added for file:\r\n\t" 
         + file + "\r\nto DataTable " 
         + sSQL.FixedDataDefinition.DataTableName); 
    } 
    catch (Exception e) 
    { 
     // Something bad happened here. 
     log.GenerateLog("0038", log.Error, "Failed to add DataRows to DataTable " 
         + sSQL.FixedDataDefinition.DataTableName 
         + " for file\r\n\t" 
         + file, e); 
    } 
    finally 
    { 
     // Successful or not, get rid of the datasource file to prevent other issues. 
     File.Delete(file); 
    } 
}

And this method:

/// <summary> 
///  Deletes columns that are not needed from a given <see cref="System.Data.DataTable" /> reference. 
/// </summary> 
/// <param name="dt"> 
///  The <see cref="System.Data.DataTable" /> to delete columns from. 
/// </param> 
/// <param name="engine"> 
///  The <see cref="FileHelperEngine" /> object containing data field usability information. 
/// </param> 
private static void DataTableCleanUp(ref DataTable dt, ref FileHelperEngine engine) 
{ 
    // Tracks DataColumns I need to remove from my temp DataTable, dt. 
    List<DataColumn> removeColumns = new List<DataColumn>(); 

    // If a field is Discarded, then the data was not imported because we don't need this 
    // column. In that case, mark the column for deletion by adding it to removeColumns. 
    for (int i = 0; i < engine.Options.Fields.Count; i++) 
     if (engine.Options.Fields[i].Discarded) 
      removeColumns.Add(dt.Columns[i]); 

    // Reverse the List so changes to dt don't generate schema errors. 
    removeColumns.Reverse(); 

    // Do the deletion. 
    foreach (DataColumn column in removeColumns) 
     dt.Columns.Remove(column); 

    // Clean up memory. 
    removeColumns.Clear(); 
}

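So, basically, since ds (the DataSet where finalDT lives) is passed by reference into the GenerateDatasource method, I could not use dt.Copy() to push data into it. I had to use Merge() to do this. Then, where I would have liked to use Merge(), I had to use a loop and ImportRow() instead, because Merge() was overwriting finalDT.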
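One plausible explanation for the overwriting (an assumption, not something confirmed in the post) is that Merge() matches rows by primary key, so a second batch whose key values restart at 1 replaces the rows that are already there. The sketch below contrasts that with dropping the batch's key column and using ImportRow(). The `AppendDemo` class and the column names `pk` and `value` are hypothetical.

```csharp
using System;
using System.Data;

public static class AppendDemo
{
    // Creates a table with an auto-increment key "pk" and a "value" column,
    // fills it with the given values, and accepts the changes.
    public static DataTable NewKeyedTable(string name, params string[] values)
    {
        var table = new DataTable(name);
        var key = new DataColumn("pk", typeof(int))
        {
            AutoIncrement = true,
            AutoIncrementSeed = 1,
            AutoIncrementStep = 1
        };
        table.Columns.Add(key);
        table.Columns.Add("value", typeof(string));
        table.PrimaryKey = new[] { key };

        foreach (string v in values)
            table.Rows.Add(null, v); // null lets the key auto-increment

        table.AcceptChanges();
        return table;
    }

    // Merge(): batch rows whose key already exists in target replace those rows.
    public static int MergeKeyed(DataTable target, DataTable batch)
    {
        target.Merge(batch, false, MissingSchemaAction.Add);
        return target.Rows.Count;
    }

    // ImportRow(): with the batch's key column removed, each imported row is
    // appended and the target's auto-increment column supplies a new key.
    public static int AppendWithoutKey(DataTable target, DataTable batch)
    {
        batch.PrimaryKey = null;
        batch.Columns.Remove("pk");
        foreach (DataRow row in batch.Rows)
            target.ImportRow(row);
        return target.Rows.Count;
    }
}
```

For two tables built with `NewKeyedTable("t", "a", "b")` and `NewKeyedTable("t", "c", "d")`, `MergeKeyed` leaves the target at two rows because both batches use keys 1 and 2, while `AppendWithoutKey` grows it to four.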

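Other issues I had to fix:

When I use ImportRow(), I also need to remove the PrimaryKey from dt, or else I get errors about duplicate keys.
FileHelperEngine or FileHelpers.Dynamic.FixedLengthClassBuilder has a problem skipping over the columns I want to ignore. It either does not acknowledge them at all, which kills my column offsets and therefore the accuracy of how the data is read from the data source file (using the FieldHidden option), or it reads them in and creates the columns anyway but does not load the data (using the FieldValueDiscarded and Visibility.Private or .Protected options). What this means for me is that after the call to engine.ReadFileAsDT(file) I have to iterate over dt and delete the columns marked as Discarded.
FileHelpers knows nothing about my PrimaryKey column or the other derived columns that are added to all of my data sources during processing, so I had to pass dt to a method (GenerateDatasourceSchema()) to sort that out. This method basically just adds these columns and ensures that the PrimaryKey is the first column.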
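The body of GenerateDatasourceSchema() is not shown, so the following is only a hypothetical reconstruction of what the last item describes: clear the ReadOnly flags, add the derived columns, and put an integer key in the first position of a table that already holds rows. The `SchemaFixup` class, the key name, and the string type of the derived columns are assumptions. The key values are assigned explicitly here rather than relying on auto-increment to fill in existing rows.

```csharp
using System;
using System.Data;

public static class SchemaFixup
{
    // Hypothetical stand-in for GenerateDatasourceSchema(): adds the derived
    // columns and a primary key as the first column of an already-populated table.
    public static void AddKeyAndDerivedColumns(DataTable dt, string keyName,
                                               params string[] derivedColumns)
    {
        // The post reports that ReadFileAsDT() marks every column ReadOnly.
        foreach (DataColumn column in dt.Columns)
            column.ReadOnly = false;

        // Derived column types are an assumption (string here).
        foreach (string name in derivedColumns)
        {
            if (!dt.Columns.Contains(name))
                dt.Columns.Add(name, typeof(string));
        }

        // Add the key, move it to the front, and number the existing rows.
        DataColumn key = dt.Columns.Add(keyName, typeof(int));
        key.SetOrdinal(0);
        for (int i = 0; i < dt.Rows.Count; i++)
            dt.Rows[i][key] = i + 1;

        // Only now declare it the primary key, once every row has a unique value.
        dt.PrimaryKey = new[] { key };
    }
}
```

Setting the primary key last matters in this sketch: declaring it before the rows are numbered would fail, because the new column starts out empty for every existing row.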

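The rest of the code is fixes I needed to make to the columns and rows. In some cases I am setting column values for each row; in others I am cleaning up errors in the original data as it comes to me from my customer.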

It's not pretty, and I hope to figure out a better way down the road. If anyone has input on how I did it, I would love to hear it.

Source

2016-08-19 16:20:57 breusshe
