Impromptu summary of a vision of the next steps for Open Source

Earlier I posted some of the grants that I've written. Here I'd just like to sum up their key features.

  1. One observation is that Stack Exchange and other commons-based peer production sites are also agent-based. In this case, the agents are people who interact with each other online, subject to some constraints.
  2. This means that we could, in principle, build artificial agents who interact using the same framework, and subject to the same constraints.
  3. These agents would have to be regulated somehow. One possible architecture would be for the agents to be rule-based: under circumstance X, they do Y.
  4. Some of the rules are basic ones like, "You can post a question here, a comment there, and so on."
  5. This becomes a little bit clever when we realise that Stack Exchange is itself an archive of rules. Given a question Q, the corresponding highly-voted answer A tells you what to do.
  6. Could we extract rules from the content of Stack Exchange that could then be used to govern artificial agents interacting on Stack Exchange (or on a separate Q&A platform)?
  7. Given such a platform, might it be useful for fairly general question-answering? For example, could agents use it to synthesise new software from high-level descriptions? Could they use it to generate custom learning materials to provide humans with on-demand upskilling? And...?
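The rule-as-Q&A idea in points 3–6 can be sketched very minimally. This is my own toy framing, not an implementation from the post: treat each question together with its highest-voted answer as a rule "when asked Q, respond A", and let a trivial agent act by lookup. The `Post` class, vote counts, and fallback message are all illustrative assumptions.

```python
# Toy sketch: mining "rules" from Q&A pairs and using them in a rule-based agent.
# All names here (Post, extract_rules, RuleBasedAgent) are hypothetical.

from dataclasses import dataclass

@dataclass
class Post:
    question: str
    answer: str
    votes: int

def extract_rules(posts):
    """For each question, keep only its highest-voted answer as the rule's action."""
    best = {}
    for p in posts:
        cur = best.get(p.question)
        if cur is None or p.votes > cur.votes:
            best[p.question] = p
    return {q: p.answer for q, p in best.items()}

class RuleBasedAgent:
    def __init__(self, rules):
        self.rules = rules

    def act(self, question):
        # Under circumstance X (a known question), do Y (give its stored answer);
        # otherwise fall back to a default behaviour.
        return self.rules.get(question, "I don't know; posting as a new question.")

posts = [
    Post("How do I undo a commit?", "Use `git reset`.", 12),
    Post("How do I undo a commit?", "Delete the repo.", -3),
]
agent = RuleBasedAgent(extract_rules(posts))
print(agent.act("How do I undo a commit?"))  # → Use `git reset`.
```

A real system would of course need fuzzy matching between new questions and stored ones, which is exactly where the decomposition and relatedness questions below come in.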

Since this represents what is probably a long-term research direction, how can we break it down into some modular steps, and do things with Stack Exchange content that are both directly useful and that contribute to this line of research?

  1. Given a question or answer on Stack Exchange, can we break it down into meaningful components?
  2. Given a question on Stack Exchange, can we identify related questions? Easier questions? All of the questions and answers that would be needed to synthesise an answer to the question?
  3. Given an issue on GitHub, similarly, can we identify the questions and answers on Stack Exchange that would be needed to synthesise a pull request that closes the issue? (Notice that we can make use of historical data on the GitHub side for this too!)
  4. Can we explore the individual learning journeys of Stack Exchange users? Are they able to answer more difficult questions over time? Do they demonstrate other evidence of learning?
  5. Does the award of badges correlate with users' contribution of value by other metrics, such as the 'depth' of the posts that they create? Are there other ways to generate formative feedback that could be used to train bots to add value on a similar site?
  6. What other sources of data might be needed to give bots meaningful capabilities? (E.g., software documentation, Wikipedia, etc.)
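
As a first concrete cut at step 2 above (identifying related questions), here is a minimal, standard-library-only sketch that ranks candidate questions by bag-of-words cosine similarity. This is an illustrative baseline of my own, not the method proposed in the post; a serious version would use TF-IDF or embeddings plus the sites' tag and link structure.

```python
# Toy sketch: rank candidate questions by textual similarity to a target question,
# using bag-of-words cosine similarity (standard library only).

import math
import re
from collections import Counter

def tokens(text):
    """Lowercase alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())

def cosine(a, b):
    """Cosine similarity between the word-count vectors of two strings."""
    ca, cb = Counter(tokens(a)), Counter(tokens(b))
    dot = sum(ca[w] * cb[w] for w in ca)
    norm = math.sqrt(sum(v * v for v in ca.values())) * \
           math.sqrt(sum(v * v for v in cb.values()))
    return dot / norm if norm else 0.0

def related(target, candidates, k=3):
    """The k candidates most similar to the target question."""
    return sorted(candidates, key=lambda q: cosine(target, q), reverse=True)[:k]

questions = [
    "How do I merge two branches in git?",
    "How do I bake bread?",
    "What does git merge do?",
]
print(related("How to merge branches with git?", questions, k=2))
```

The same scoring could, in principle, be pointed at GitHub issue text to attempt the Stack Exchange/GitHub linkage in step 3.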

I'm sure these aren't all the questions that could be asked about this content, but this is a start: a half dozen each of long-term and somewhat more immediate questions to think about.


  • Is this related to TRIZ, a framework for problem solving?
  • Is this related to conceptual blending? I.e., I thought that I was going to be solving problems of type X, but now I have to do X'.
  • Pretty much everything in tech is a blend, so how do we decompose these things into their parts?
  • In engineering you overcome real-world problems, but software packages up features in a way that restricts options!
