According to Books Online, “a LineString is a one-dimensional object representing a sequence of points and the line segments connecting them.” That is, a single row which can describe a complete path rather than needing many rows each containing just one point.
My earlier post deals with a way to get GPX files (the format used by most handheld GPSr units and by most smartphone position tracking apps) loaded into a SQL server table, but all I have ended up with is a list of points and the time I was at them. There’s no concept of travel, and I can’t use any simple methods to calculate the distance travelled or what other tracks I might have intersected with. Also, I have 2652 rows for a single day of walking, which I could condense to 1 row if I used LineStrings instead of points. I’m up for that and most client programs are going to prefer it too.

Creating a LineString is trivial using SQL Server’s spatial methods:
declare @line geometry;
set @line = geometry::STGeomFromText('LINESTRING(-0.01278 51.68764, -0.01400 51.74256)',0);
select @line;
The LineString is created from the list of coordinates in the format longitude[space]latitude in a comma separated list. You can include as many longitude/latitude pairs as you like in the list. So the challenge is to take the table of coordinates and pivot it into a text column which contains all of them in a single string. SQL Server’s PIVOT methods don’t provide a way of doing this and nor does the SSIS pivot transform. You could do it with a cursor in T-SQL but that’s not going to scale particularly well.
Much better, then, to do this at load time in SSIS using a script which can group the track points by their identifier and create the string of points that can be used for the LineString. Scripts generally scale really well and you’ve effectively got all of .NET at your disposal to accomplish the task.
The first step is to write a query that provides the data. The only vital thing is that you need to sort the results by your grouping key and in time order. Based on the simple schema I have loaded the points into I need to write code like this:
select
t.name as TrackName
,p.time as PointTime
,p.ele as PointElevation
,convert(varchar(11), p.lon) + ' ' + convert(varchar(11), p.lat) as Coordinate
from
Tracks t
inner join Segments s on
t.trk_Id = s.trk_Id
inner join Points p on
s.trkseg_Id = p.trkseg_Id
order by
t.trk_Id
,p.time
The script is going to work by adding the points to a string one by one, so they need to be in time order. It checks for changes in the track identifier (in this case it’s TrackName) to create each line. I’ve also stitched the longitude and latitude together into a single column with a space between them.
Because the script is going to take multiple incoming records and group them together to create a single output record, it needs to be an asynchronous script. To accomplish this, add a script transform to the data flow. On the Inputs and Outputs page, delete the Output 0 output and add a new output. Check that its SynchronousInputID property is set to None to make it an asynchronous script. Add the columns which you want to output to the script; at a minimum you’re going to need a column for your track identifier and a column for the LineString but I’m also adding a start and end time and a number of points. The LineString column should probably be created as a DT_TEXT type, because the DT_STRING type can only hold 4000 characters. DT_TEXT has to be worked with using binary methods in the script but at least it can hold as many characters as you like.

Each row flowing through the script could match one of three conditions:
- It could be the first row of the entire data set.
- It could be a row which belongs to the same track as the previous track.
- It could be a row which is part of a different track.
On cases 1 and 3, a new line needs to be started. On case 2, the points for the incoming row need to be added to the LineString currently being built. On case 3, the current LineString needs to be closed and a row written to the output, and a new LineString started. Finally, on the very last row of the data, the last LineString needs to be finished and a row written to the output.
Here’s the script that does the work. Notice that I am also keeping a record of the start time of the track, the end time, the name of the tracks, and the number of points that make up the track.
#region Namespaces
using System;
using System.Data;
using Microsoft.SqlServer.Dts.Pipeline.Wrapper;
using Microsoft.SqlServer.Dts.Runtime.Wrapper;
using System.Text;
using Microsoft.SqlServer.Dts.Pipeline;
#endregion
[Microsoft.SqlServer.Dts.Pipeline.SSISScriptComponentEntryPointAttribute]
public class ScriptMain : UserComponent
{
public override void PreExecute()
{
base.PreExecute();
/*
* Add your code here
*/
}
public override void PostExecute()
{
base.PostExecute();
/*
* Add your code here
*/
}
bool FirstRow = true;
string Track = null;
DateTime Date = default(DateTime);
DateTime StartDate = default(DateTime);
string Coordinate = null;
string StartPoint = null;
int Points = 0;
StringBuilder Linestring = new StringBuilder("LINESTRING(");
public override void Input0_ProcessInput(Input0Buffer Buffer)
{
while (Buffer.NextRow())
{
Input0_ProcessInputRow(Buffer);
// After the first row has been processed, turn off the first row flag
FirstRow = false;
}
if (Buffer.EndOfRowset())
{
// Last row so finish off final callsign record
AggregatedTrackLinesBuffer.TrackName = Track;
AggregatedTrackLinesBuffer.LineStartTime = StartDate;
AggregatedTrackLinesBuffer.LineEndTime = Date;
AggregatedTrackLinesBuffer.NumberOfPoints = Points;
Linestring.Remove(Linestring.Length - 2, 2);
Linestring.Append(")");
AggregatedTrackLinesBuffer.TrackLine.AddBlobData(System.Text.Encoding.ASCII.GetBytes(Linestring.ToString()));
}
}
public override void Input0_ProcessInputRow(Input0Buffer Row)
{
// Rows are in track and date/time order. Need to add each pair of points
// to a string, separated by commas. Start a new string if the callsign changes.
if (FirstRow) // First row in data set
{
AggregatedTrackLinesBuffer.AddRow();
Points = 1;
Track = Row.TrackName;
Date = Row.PointTime;
StartDate = Row.PointTime;
Linestring = Linestring.Append(Row.Coordinate);
Linestring = Linestring.Append(", ");
Coordinate = Row.Coordinate;
StartPoint = Row.Coordinate;
}
else if ((!FirstRow) & (Track != Row.TrackName)) //New track
{
// Write the current variables out to the row
AggregatedTrackLinesBuffer.TrackName = Track;
AggregatedTrackLinesBuffer.LineStartTime = StartDate;
AggregatedTrackLinesBuffer.LineEndTime = Date;
AggregatedTrackLinesBuffer.NumberOfPoints = Points;
Linestring.Remove(Linestring.Length - 2, 2);
Linestring.Append(")");
AggregatedTrackLinesBuffer.TrackLine.AddBlobData(System.Text.Encoding.ASCII.GetBytes(Linestring.ToString()));
// Begin a new line
AggregatedTrackLinesBuffer.AddRow();
Points = 1;
Track = Row.TrackName;
Date = Row.PointTime;
StartDate = Row.PointTime;
Coordinate = Row.Coordinate;
StartPoint = Row.Coordinate;
Linestring.Clear();
Linestring.Append("LINESTRING(");
Linestring = Linestring.Append(Row.Coordinate);
Linestring = Linestring.Append(", ");
}
else if ((!FirstRow) & (Track == Row.TrackName)) // New point for same track
{
Points += 1;
Date = Row.PointTime;
Coordinate = Row.Coordinate;
Linestring = Linestring.Append(Row.Coordinate);
Linestring = Linestring.Append(", ");
}
}
}
Running the package with this script condenses the inbound rows to one row per track name:

The TrackLine column is DT_TEXT so the data viewer can’t show it, but it can be inserted into a varchar(max) column in SQL Server with no problem:

To render the LineString column as a geography or geometry spatial column you would use the appropriate syntax:

I have saved the database I used, including the track points, and the SSDT project with the SSIS and a couple of reports to show points and LineStrings in action here.