As mentioned earlier, Scala's Source class supports several kinds of input: besides files, it can also read data from a URL.
The simplest way is Source.fromURL:
scala> scala.io.Source.fromURL("http://argcv.com").mkString
res1: String =
argcv | enjoy code, enjoy life.
....
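This is not very safe, though, because no timeout is set. We ran into this once: a server was misbehaving, one of our backends kept trying to connect to it, and those calls hung until we ran out of available connections. We should add a timeout.
After reworking the code a little, I ended up with the following simple function: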
def fromUrlWithTimeout(url: String, timeout: Int = 1500): String = {
  import java.net.URL
  import scala.io.Source
  import scala.util.control.NonFatal
  val conn = new URL(url).openConnection()
  // Both timeouts are given in milliseconds.
  conn.setConnectTimeout(timeout)
  conn.setReadTimeout(timeout)
  // Connection failures (e.g. a connect timeout) propagate from here.
  val stream = conn.getInputStream()
  try {
    Source.fromInputStream(stream).mkString
  } catch {
    // A non-fatal failure while reading the body yields an empty string.
    case NonFatal(_) => ""
  } finally {
    stream.close()
  }
}
Using it is straightforward:
scala> fromUrlWithTimeout("http://argcv.com",3000)
res2: String =
argcv | enjoy code, enjoy life.
...
Or, with a very small timeout, the result looks like this:
scala> fromUrlWithTimeout("http://argcv.com",100)
java.net.SocketTimeoutException: connect timed out
at java.net.PlainSocketImpl.socketConnect(Native Method)
at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:345)
....
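As the trace above shows, a timeout reaches the caller as an exception. If you would rather handle it as a value, one option is to wrap the whole call in scala.util.Try. The following is only a sketch under that assumption: the names TimeoutFetch, tryFromUrl, and bodyOrFallback are hypothetical and not part of the original function.

```scala
import java.net.{SocketTimeoutException, URL}
import scala.io.Source
import scala.util.{Failure, Success, Try}

object TimeoutFetch {
  // Hypothetical variant of fromUrlWithTimeout: every non-fatal failure,
  // including connect and read timeouts, is captured in the returned Try.
  def tryFromUrl(url: String, timeoutMs: Int = 1500): Try[String] = Try {
    val conn = new URL(url).openConnection()
    conn.setConnectTimeout(timeoutMs) // milliseconds
    conn.setReadTimeout(timeoutMs)    // milliseconds
    val stream = conn.getInputStream()
    try Source.fromInputStream(stream).mkString
    finally stream.close()
  }

  // Hypothetical helper: substitute a fallback only when the call timed out,
  // and rethrow any other failure unchanged.
  def bodyOrFallback(url: String, timeoutMs: Int, fallback: String): String =
    tryFromUrl(url, timeoutMs) match {
      case Success(body)                      => body
      case Failure(_: SocketTimeoutException) => fallback
      case Failure(other)                     => throw other
    }
}
```

With this shape, a caller can tell a timeout apart from a genuinely empty response body, which the version returning "" on read failures cannot do.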
If you want to download a file, see here.