Mapping co-ops and their relationships

harry · 22 May 2017 12:37

@asimong funnily enough years at the coalface of linked data have made me feel its better to collect and publish the data first and then worry about the format. The world has a hell of a lot of unpopulated schemas/ontologies and not enough instance data.

I like @mixmix’s idea of minimum viable data (though I think we might as well make the ‘500 NZD’ bit stuctured )

I’ve added Collaborations to the website (https://www.coops.tech/wp/wp-admin/post.php?post=2037&action=edit)

We’re just creating a pull request that will allow us to fetch the data out as JSON. It won’t be in JSON-LD but it will hold URIs for each co-op, etc. It would be fairly easy to add a JSON-LD endpoint if someone had a use case.

asimong · 22 May 2017 13:08

Harry @harry that’s great to hear that format is less important to people, one less thing to worry about… does that mean that, in your experience, the data is easy to transform from one format to another? I expect it is, if it follows any established linked data format, ultimately resolvable into triples / quads…

But thanks to your drawing attention to it, I think I didn’t express myself clearly. I’m absolutely not worried about matters like e.g. JSON-LD v. Turtle. So I’ll edit my post to replace the word “formats”, as I can see that means something I didn’t intend. In my (interoperability/portability) experience, what matters is the underlying – well, I can’t think of a better word than “ontology”, though that’s not everyone’s favourite word. (Goodness me, the Ontology Outreach Advisory is 10 years old already, where did that time go?)

OK, let’s get beneath the word “ontology”. I guess we share the recognition that having different data sets with similar, but different, and overlapping meanings can play havoc with interoperability or portability, and indeed the very linking of the data. Or don’t you think so?

What I mean is, can we agree on what our data represents, before trying to link it? That’s the essence of ontology, to me. If we are mapping co-ops and their relationships, it’s to say the least useful if we agree on what a co-op is, what properties and relationships they may have, etc. etc. Or, again, do you have a deeper view than this?

harry · 22 May 2017 13:28

In my experience even very well resourced organizations such as BBC and New York Times fail to unlock the interoperable potential of datasets and end up wasting a lot of time on issues of second degree reification, etc.

I’d suggest that we focus on publishing the data simply in plain JSON, just as Co-ops UK did. Someone can then map this to linked data concepts as @mattw did.

That way the project to collect and publish the data doesn’t need to be held up by the question of which ontology to use.

Or challenge we’ll face is trying to get people to type the right amount in the box that says “total project value - the amount going on the invoice excluding VAT” etc. If we can’t get that right with a very high degree of confidence then the finer subtleties of our chosen ontology will be irrelevant

asimong · 22 May 2017 13:35

Harry @harry, I’m trying to find common ground and understanding here – could I ask what you mean by this?

How could they, in your view?

You mention

Could I ask for some clarification here? What might be alternatives, in what situation?

I hope this is OK for people to go over these basics, and that it helps others as well as me!

harry · 22 May 2017 14:23

how could they in your view?

When I was at the BBC they spent millions of quid modelling the olympics as an ontology and publishing all the events, teams, venues, etc as linked data. It was all loaded into a triplestore and queried. The aim was to enable some sort of data sharing/new apps/new use cases but as far as I’m aware, nothing ever came of the investment.

a question of what ontology to use

I wouldn’t use an ontology as I’ve never seen the benefits. I’d use a standard schema with sensible labels e.g.

    "total_project_value": 20000, 
    "lead_partner": "http://coops.tech/co-ops/outlandish", 
    "project_partners": [
         "http://coops.tech/co-ops/open-data-services", 
         "http://coops.tech/co-ops/cetis"
     ]
}```

The whoever is doing the big data mashup can map our "total_project_value" to whatever makes sense in their context, but it's not our problem to sort it our for them.

mixmix · 22 May 2017 23:35

JUST START is the take-away. We can iterate IF it’s useful.
Worst case someone spends an afternoon hand-migrate some data - it’s not like we’re running a large hadron collider or anything

But on a serious note, I think we can all agree that the core standardisation we need to focus our energy on is whether we’re running or case formatting. (bikeshed troll)

Looks like an awesome start @harry !

djembejohn · 23 May 2017 10:42

Hi everyone,

A really interesting topic. Sorry this is a bit of a ‘long stream of conciousness’ post.

I agree with mixmix’s idea about just starting. Migrating data between formats is not a big job. Having said that, I have found that working with Python and Mongo has been good in the past for its flexibility of data storage, simplicity for linking with APIs, available tools for analysing data, and outputting all kinds of formats.

My approach to these things is to think about what questions you’d like to be answered. Perhaps I’ll get the ball rolling by putting a few out there and briefly putting some potential answers. I find a good approach is to put all the ideas out there at first, don’t worry if they’re not very feasible, someone else might come up with a nearby idea which is feasible. However, in a later stage it’s then good to choose the best/most feasible ones and run with them. But please take these ideas as simply being thought-provoking and feel free to critique/add to them.

What are the aims an objectives of the data collection? What’s the data for, why do we want to collect it?
+ These are always difficult questions to answer, which is why big corporations like the BBC can spend silly money on projects without having a clear reason as to why they’re doing it. I think the questions are important though. Sometimes “because we can” is a good enough answer, but I think it’s worthwhile to at least give this question some thought.
+ For me, I’m interested in the idea of understanding co-ops as a kind of ecosystem. Different co-ops have different relationships with one another, there are different ‘species’ of co-op that work in different niches, etc. I am interested in computer/mathematically modelling this ecosystem in some way, and data can always inform models. The larger goal would be that by understanding how the ecosystem works, we can adapt better to it.
What classes of data would we want to collect?
+ Online data - the online presences of the different co-ops.
+ Financial/trading data - how co-ops trade with other companies.
+ Geo-location data - where the co-ops have offices and outlets, etc.
+ Member/employee data
+ Temporal data - how these things have changed over time
What accessible data are currently out there?
+ I guess all cooperatives have some kind of Internet presence. One idea might be to make a collection of their websites. You could glean geolocation data from the sites and look at how the different sites link to one another. I have some thoughts about adapting a sampling algorithm, and network analysis tools, which I have already written, to do this.
+ A second kind of Internet presence might be on social media. I’ve done some work mapping Twitter accounts in the past and finding out how they link into groups (Twitter users forming tribes with own language, tweet analysis shows | Twitter | The Guardian). I could look into doing something along these lines with Twitter, for which I have quite a few API tools - however it might be quite limiting as it is linked mainly to Twitter. However, I’m always up for moving to other APIs. However, perhaps co-ops are often a little under the social media radar, so the social media approach might not be so great?
+ I guess companies house would have some records?
+ It might be possible to send out a questionnaire to co-ops to glean some of these data?
+ What have others already done which might be useful?

chris · 23 May 2017 20:27

I’ve posted a screenshot of the Collaboration admin interface to the wiki — it looks great

mattw · 7 June 2017 10:55

Very interesting, @djembejohn. I really hope we can collaborate in the not too distant future - we (Institute for Solidarity Economics) plan to work on data for the Solidarity Economy during the rest of 2017, extending the work we have already done in this area, and building on work done by others.

Here is a ‘Charter for Building a Data Commons for a Free, Fair and Sustainable Future’ that I helped to write, with others (Silke Helfrich, people from TransforMap, and RIPESS Europe) earlier this year:
https://discourse.transformap.co/t/charter-for-building-a-data-commons-for-a-free-fair-and-sustainable-future/1368

harry · 7 June 2017 12:47

I think it sounds like we’re all on the same page (or the same place on two different pages, as @mixmix points out).

Page 1: we all agree it would be good to start collecting data on CoTech collaborations, and agree that the WordPress thing I made is an ok starting point.
Page 2: we all agree that it would be cool if some stuff happened with open data, and that it might have some important long term implications for the economy and/or the commons.

Can I suggest we start moving towards some actions? Personally I’d be up for doing the following to move things forward:

Getting someone to deploy the updated version of the site to enable the JSON API
Adding all of Outlandish’s collaborations to date
Making any suggested refinements to the ‘data model’ that would make it more useful/accurate/etc.

Any other volunteers?

mattw · 7 June 2017 13:52

I definitely agree that starting is a good idea, and that gives us a platform from which to explore/learn/iterate, etc.
I do have my eyes on some of the long term aspirations, and there are a few very small things that could be done early to smooth the way (off the top of my head, I can think of only one!)

I’d find it easier to collaborate if we were tracking suggestions, issues, code, examples, documentation, etc on something other than discourse (much as I love discourse). I suggest creating a repository on github, and I’m happy to create one if there’s agreement (among those who want to work on this). If we want to go down this route, we should probably start a new thread to make it more visible within CoTech.

mixmix · 8 June 2017 05:15

and my axe !

I’m up for:

making a simple site which makes it easy to view the data from the API
entering future collaborations in

IMO anything that makes this data more alive gives this initiative a better chance.

Let me know when you’ve got an API I can consume up @harry

future dreaming (distraction) - I’ve friends who’ve worked in this domain a bit with things like holodex

harry · 8 June 2017 11:25

The code is already in github I believe: GitHub - cotech/website: The Cooperative Technologists WordPress website - see CoTech WordPress - CoTech for details

At the moment there is a bit of a bottleneck in that only @chris can deploy to the test and production servers at the moment (I believe). I’d like to be able to deploy at least to the test server - I believe we may need to join webarchitects to get access to their infrastructure?

mattw · 8 June 2017 11:48

Splendid! I suggest that a new repository is set up for work that is to do with data/api (the exact name may need some pondering depending on what we think is the scope of what we are doing). Then we have a place to raise issues, document usage, make suggestions, etc. Of course, some of these may require implementation in the website repo, but having a separate repo keeps things de-coupled, and some of the content in the data/api repo may have nothing to do with the website.

harry · 8 June 2017 12:44

Go ahead

Probably just easiest to start the repo on your own github and then get @chris or @olizilla to fork the repo into the cotech org github.

sdgluck · 8 June 2017 13:34

@mattw Send over your GitHub username and I will add you to the CoTech org.

olizilla · 8 June 2017 14:06

And my

I love to fork repos into other repos and am occasionally available for such duties. But be warned if you summon me, I may keep my self amused with self narrating photographs.

mattw · 8 June 2017 14:07

matt-wallis. Thank you

chris · 8 June 2017 15:11

@harry said:

At the moment there is a bit of a bottleneck in that only @chris can deploy to the test and production servers at the moment (I believe). I’d like to be able to deploy at least to the test server - I believe we may need to join webarchitects to get access to their infrastructure?

The issue is that the dev and live sites are running on shared servers and clients only have SSHFS / SFTP access to update their sites, however SSH access is needed due to the way this site is deployed, the way around this issue would be to have a virtual server just for these sites, then we could do exactly what we want, but we would also then have a extra server to maintain and pay for…

The current issue with the pull request from Matt Kendon is one I’m afraid I need help with, do you want to follow this up on that issue thread?

mixmix · 8 June 2017 23:48

Feel free to boo, but how about Heroku?

| pros                             | cons                 |
|----------------------------------|----------------------|
| fast                             | it's a middleman!    |
| easy management                  | costs more money     |
| practice spending money together |                      |

I honestly think spending together accelerates cohesion.