packer/vendor/github.com/gocolly/colly
Adrien Delorme ee716d3f7e
up go mod, go mod vendor & go mod tidy
7 years ago
debug
storage
.codecov.yml
.travis.yml
CHANGELOG.md
CONTRIBUTING.md
LICENSE.txt
README.md
VERSION
colly.go
context.go
htmlelement.go
http_backend.go
request.go
response.go
unmarshal.go
xmlelement.go

README.md

Colly

Lightning Fast and Elegant Scraping Framework for Gophers

Colly provides a clean interface to write any kind of crawler/scraper/spider.

With Colly you can easily extract structured data from websites, which can be used for a wide range of applications, like data mining, data processing or archiving.


Features

  • Clean API
  • Fast (>1k requests/sec on a single core)
  • Manages request delays and maximum concurrency per domain
  • Automatic cookie and session handling
  • Sync/async/parallel scraping
  • Caching
  • Automatic encoding of non-unicode responses
  • Robots.txt support
  • Distributed scraping
  • Configuration via environment variables
  • Extensions

Example

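package main

import (
	"fmt"

	"github.com/gocolly/colly"
)

func main() {
	c := colly.NewCollector()

	// Find and visit all links
	c.OnHTML("a[href]", func(e *colly.HTMLElement) {
		e.Request.Visit(e.Attr("href"))
	})

	// Log each request before it is sent
	c.OnRequest(func(r *colly.Request) {
		fmt.Println("Visiting", r.URL)
	})

	// Start scraping from the seed URL
	c.Visit("http://go-colly.org/")
}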
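
The example hands each `href` value to `Visit` as found in the page. Because `href` values are frequently relative, they have to be resolved against the page URL at some point before they can be fetched. The following is a minimal standard-library sketch of that resolution step. It is an illustration only, not Colly's own code, and `resolveLink` is a hypothetical helper name.

```go
package main

import (
	"fmt"
	"net/url"
)

// resolveLink is a hypothetical helper (not Colly API). It resolves href
// against the URL of the page it was found on and returns an absolute URL.
func resolveLink(pageURL, href string) (string, error) {
	base, err := url.Parse(pageURL)
	if err != nil {
		return "", err
	}
	ref, err := url.Parse(href)
	if err != nil {
		return "", err
	}
	// ResolveReference leaves absolute references untouched and resolves
	// relative ones against the base URL.
	return base.ResolveReference(ref).String(), nil
}

func main() {
	hrefs := []string{"/docs/", "articles/", "https://github.com/gocolly/colly"}
	for _, href := range hrefs {
		abs, err := resolveLink("http://go-colly.org/", href)
		if err != nil {
			fmt.Println("skipping", href, err)
			continue
		}
		fmt.Println(abs)
	}
}
```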

See the examples folder for more detailed examples.

Installation

go get -u github.com/gocolly/colly/...

Bugs

Bugs or suggestions? Visit the issue tracker or join #colly on freenode.

Other Projects Using Colly

Below is a list of public, open-source projects that use Colly:

If you are using Colly in a project please send a pull request to add it to the list.

Contributors

This project exists thanks to all the people who contribute. [Contribute].

Backers

Thank you to all our backers! 🙏 [Become a backer]

Sponsors

Support this project by becoming a sponsor. Your logo will show up here with a link to your website. [Become a sponsor]

License

FOSSA Status